Tuesday, May 10, 2016

DBIR Puzzle 2016 Writeup

Update 1: @BryanSchuetz shared screenshots of his work on the Haberdasher and Tipton puzzles - great work! He made it look easy.
Update 2: @gdbassett shared the solution for the Hawk puzzle, which he created. Looks like I need to brush up on R and data analysis!
Update 3: @BryanSchuetz posted a write-up of his solutions here.
Update 4: Kelly Jackson Higgins of DarkReading posted an interview with this year's winner, @mattjay. It contains a few more interesting tidbits - looks like "PLAYGOLF" was a red herring after all.

Intro

This was my first year playing the DBIR Puzzle. I had meant to play for the last few years, ever since I chatted with @Blackjack988 at an IANS conference and learned that he and his teammates had won. "I'm pretty good with puzzles," I thought. "How hard could this be?" Turns out, it's pretty hard. I came in 3rd place this year, and I want to try again and get first next year.
When the DBIR first hit the news, I immediately grabbed the PDF and started poring over it...and found nothing much. The pyramid on the front cover featured the letters MIXVDLIC, which aren't valid Roman numerals. I ran this through a Caesar cipher decoder and got back "PLAYGOLF". I spent a while trying to find something, anything online that might be the trailhead for this year's puzzle, related to golf or playing it. Then I realized that there was no mention of this year's report on Verizon's site or Twitter. PRO TIP: the puzzle doesn't start until the official release; the "insider" or pre-release copy doesn't include any puzzle pieces (I assume so nobody in the infosec community gets a head start). Take a deep breath.
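For anyone who wants to reproduce the Caesar step without an online decoder, here is a minimal sketch. The function name caesar_shift is my own; trying all 26 rotations and eyeballing the output is just one way to do it.

```python
import string

def caesar_shift(text: str, shift: int) -> str:
    """Shift each A-Z letter forward by `shift` positions, wrapping around."""
    out = []
    for ch in text.upper():
        if ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)  # leave anything that is not a letter untouched
    return "".join(out)

# Print every rotation and look for one that reads as English.
for shift in range(26):
    print(shift, caesar_shift("MIXVDLIC", shift))
```

A shift of 3 is the rotation that turns MIXVDLIC into PLAYGOLF.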
I spent some time reading through writeups of previous years' DBIR puzzles, and found that references to previous DBIR puzzles were common.
In the meantime, I noticed that the @VZDBIR Twitter account had changed its icon to the all-seeing eye logo featured on the front cover of this year's DBIR. I could tell that the bricks in the pyramid had some extra marks, but I couldn't make out anything helpful.
 

Report release

When the official report came out, I went straight to the pyramid. Here's what I found:
1) The MIXVDLIC lettering was still there. After more googling, I set this aside. I never did figure out how this related to the game - maybe this was a taunt aimed at casual or non-technical players, as in "You think you're hardcore enough to play this? Nah, you're just a clueless n00b in a suit - go play golf!"
2) The bricks in the pyramid had long and short vertical tick marks.

I initially read them as four kinds - long-up, long-down, short-up, and short-down. However, I soon realized there were only two real types (long and short), and this was Morse code for CYBERCDC.GLOBAL, which brought me to the main site for the game.
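Once the marks are transcribed, the decoding itself is mechanical. The sketch below assumes a transcription format of my own invention: L for a long tick, S for a short one, and a space between letters (how the letter boundaries appeared on the cover is not captured here).

```python
# International Morse code for letters, plus the full stop.
MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E",
    "..-.": "F", "--.": "G", "....": "H", "..": "I", ".---": "J",
    "-.-": "K", ".-..": "L", "--": "M", "-.": "N", "---": "O",
    ".--.": "P", "--.-": "Q", ".-.": "R", "...": "S", "-": "T",
    "..-": "U", "...-": "V", ".--": "W", "-..-": "X", "-.--": "Y",
    "--..": "Z", ".-.-.-": ".",
}

def decode_ticks(ticks: str) -> str:
    """Decode space-separated groups of L (long) and S (short) marks."""
    letters = []
    for group in ticks.split():
        code = group.replace("L", "-").replace("S", ".")
        letters.append(MORSE.get(code, "?"))  # "?" marks an unknown group
    return "".join(letters)

# First five letters only, transcribed by hand for illustration.
print(decode_ticks("LSLS LSLL LSSS S SLS"))
```

Unknown groups come back as "?", which makes transcription mistakes easy to spot.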

3) There were extra characters hidden in the paragraph on the back of the DBIR:
About the cover
The ccover features three separate visualizations. The datay
visualizations use the incident corpus bof the 2016 DBIR.e64,199
incidents and r2,260 breaches represent the .finalized dataset
used in this year's report.pa
Thetwhite lines in the background are a tree map that segments
the cover into eight sections.hEach box represents an action or
attribute ocount within the data. The top three sections, clockwise
beginning with the upper left, are based on the number of incidents
with confidentiality, integrity and availability losses. The five
remainingg sections in the lower half represent (clockwise from
the left-most section) incidents featuring the following threat
actions: error, hacking, social, malware and misuse. Thee area
of each box represents how many incidents featured that
enumeration relative ton the other enumerations..
a
The second image is the pyramid from the back of ai United States
dollar bill. It represents the strong financial motivation of breaches
in the report this year.

...
 
When put together, they spelled out "cyber.pathogen.ai".
This led me to http://cyber.pathogen.ai, which was a lovely Pew-Pew map titled "IPew Cyber Pathogen AI Conquest Status". The help menu told me I was looking at a Cyber Pathogen (which is a very timely topic, I'm told - maybe someone plugged in the wrong iPhone?)
So what was I supposed to do here? My first reflex on a web-based puzzle is always to check the source, and I was not disappointed:
    // Initialize core AI
    console.log('Initiating AI core ls0tli4wljauli4ums4xljexll9fls0u')
This told me two things: 1) that long string was probably worth noting, and 2) the page was writing to the console, so I should keep an eye on it.
Going back to cybercdc.global, there was a link at the top to input a "core AI", which I knew from the iPew map. I entered ls0tli4wljauli4ums4xljexll9fls0u .
This got me to http://cybercdc.global/pages/ls0tli4wljauli4ums4xljexll9fls0u.html - the REAL main page for the puzzle.
I also spent way too long running the PDF through Didier Stevens' pdf-parser.py and pdfid, trying to find hidden objects within the PDF. This was definitely overdoing it.
A sharp-eyed colleague also noticed that there were two halves of a message mixed into the shading around the pyramid. With some MS Paint magic and a lot of squinting and mistakes, I uncovered the following:
Y3liZXJDREMuZ2xvYmFs
This base64-decodes to cyberCDC.global, which I'd already uncovered, but it showed that the DBIR puzzlers really wanted me to find this.
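The base64 step is a one-liner with Python's standard library, shown here as a quick check of the string I squinted out of the shading:

```python
import base64

hidden = "Y3liZXJDREMuZ2xvYmFs"  # string recovered from the cover shading
decoded = base64.b64decode(hidden).decode("ascii")
print(decoded)
```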
Just in case, I googled "cyber CDC" and wound up at https://twitter.com/cyber_cdc, the official Twitter account for the Cyber CDC, and the source of many valuable hints and a few laughs.

Main Page

The mission statement laid out the terms of the game:
This is a very dangerous cyber pathogen. Your job is to identify Patient Zero for the Cyber Pathogen and report them to the Global Cyber CDC SOCOM at cyber_cdc@mail.com for immediate mitigation.
 
I realized that I would need to recruit several cyber-pathologists by solving puzzles and entering flags into a web form at the bottom, which would convert them to an MD5 hash ("Cyber Authorization Code"), and then submit them. Doing so would bring me back to cyber.pathogen.ai, where the console message would indicate if I was correct or not. In order to check, you needed to let the page load and print to the console using JavaScript, and this would inevitably trigger the map to start making its "PEW...PEW...PEW" sounds. I think I drove a few co-workers nuts with those sounds - apologies to all within earshot. (Yes, I know you could turn it off with the nofx parameter, but I was in a rush.)
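The flag-to-code conversion can be reproduced offline. This is only a sketch: I am assuming the form hashes the bare flag text with plain MD5 (no salt, no trailing newline), and the function name and sample flag are made up.

```python
import hashlib

def authorization_code(flag: str) -> str:
    """Return the hex MD5 digest of a flag (assumed to be how the CAC is built)."""
    return hashlib.md5(flag.encode("utf-8")).hexdigest()

# "exampleflag" is a placeholder, not a real answer from the game.
print(authorization_code("exampleflag"))
```

Computing the code locally also makes it easy to keep a list of key/CAC pairs as you go.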

I tried all of the puzzles, but didn't solve them all. Here are the ones I got, along with the progress I made on the ones that stumped me.
  • Colonel Henry J. Haberdasher
  • Dr. Rob Bootis - SOLVED
  • Dr. Gi Girond - SOLVED
  • Dr. Hans Unum - SOLVED
  • Dr. Alexi Hawk
  • Dr. Oliver Molina - SOLVED
  • Dr. Addison Mueller - SOLVED
  • Sir Baskart William - SOLVED
  • Dr. Pedro Tipton

Colonel Henry J. Haberdasher

This puzzle relates to an image that's "hidden" on the Colonel's page.
Colonel Haberdasher is known for sharing photos of the content of his trashcan, generally involving other people's possessions. This week's photo:
<br><img style="width:100; max-width:100%; height:auto;" src="../images/took.jng">
At first I thought, ha, that's an Archer pic - I'm a fan, but I wasn't sure how it fit into the theme of the game. However, I noticed that .jng isn't a common extension, and the same "womp womp" image was coming up when I tried to poke around on the site (seeing if I could see index pages for folders, etc). I surmised that this image meant that the file did not exist, and after some fiddling with the extension I found http://cybercdc.global/images/took.png 

Having read the previous year's solution, I recognized this as the Haberdasher's Puzzle (after which the Colonel is named). I printed and cut out the shapes, assembled the pieces into a square, copied the text out, reading left to right, and base64-decoded it. The characters weren't all printable, but I looked in hexdump:
00000000  50 34 0a 35 30 20 32 30  0a 96 80 00 00 0a 00 00  |P4.50 20........|
00000010  00 00 00 00 00 00 00 00  8a f0 00 00 70 00 00 00  |............p...|
00000020  40 02 10 00 00 00 00 00  07 80 02 00 38 e1 e0 fc  |@...........8...|
00000030  7f 02 00 18 42 10 42 11  02 00 14 44 08 42 21 02  |....B.B....D.B!.|
00000040  00 14 44 0b 1a 24 02 00  12 44 08 42 3c 02 00 11  |..D..$...D.B<...|
00000050  44 08 7c 2d 02 00 11 44  08 40 21 00 00 10 c2 10  |D.|-...D.@!.....|
00000060  40 21 07 00 38 c1 e0 f8  7f 07 00 00 00 00 00 00  |@!..8...........|
00000070  00 00 00 00 08 a9 00 00  00 00 00 2a 8a 40 00 00  |...........*.@..|
00000080  00 00 00 00 14 00 00 88  00 18 00 01 40 01 50 00  |............@.P.|
00000090  00 00 00 00 00 60 61 76  f4 01 f0 00 02 f0 2a 0a  |.....`av......*.|
The P4 header is a characteristic of the .pbm bitmap format (http://netpbm.sourceforge.net/doc/pbm.html) so I renamed it and tried to open it in IrfanView:

Oh. Well then.
I was stumped. I tried moving the Haberdasher pieces around, transcribing and base64 decoding, to no avail. I tried treating the extra pixels in the NOPE image as Morse code.
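If you don't have an image viewer that understands .pbm, the raw P4 format is simple enough to render as text. The sketch below is my own minimal reader (it ignores header comments, which the format allows); the tiny 3x2 image in the demo is invented, not the puzzle data.

```python
def render_pbm(data: bytes) -> str:
    """Render a raw (P4) PBM image as text: '#' for black, '.' for white."""
    tokens, pos = [], 0
    # The header is three whitespace-separated tokens: magic, width, height.
    while len(tokens) < 3:
        while pos < len(data) and data[pos:pos + 1].isspace():
            pos += 1
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        if start == pos:
            raise ValueError("truncated PBM header")
        tokens.append(data[start:pos])
    if tokens[0] != b"P4":
        raise ValueError("not a raw PBM file")
    width, height = int(tokens[1]), int(tokens[2])
    pos += 1  # a single whitespace byte ends the header
    row_bytes = (width + 7) // 8  # each row is padded to a whole byte
    lines = []
    for y in range(height):
        row = data[pos + y * row_bytes:pos + (y + 1) * row_bytes]
        bits = "".join(f"{byte:08b}" for byte in row)[:width]
        lines.append(bits.replace("1", "#").replace("0", "."))
    return "\n".join(lines)

# Invented 3x2 demo image; for the real thing, pass the decoded file's bytes.
print(render_pbm(b"P4\n3 2\n" + bytes([0b10100000, 0b01000000])))
```

With the "50 20" header from the hexdump, each row takes 7 bytes, so the raster should be 140 bytes long.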
The Cyber CDC gave the following clues:
  • Haberdasher: you're doing it right. Which is wrong. #DBIRPuzzle
  • Haberdasher is clearly a joker but we suspect the solution is a matter of perspective. #DBIRPuzzle
  • Dr. Haberdasher likes to turn the tables. #DBIRPuzzle
I imagine the solution involves either rotating the NOPE image, or rotating the Haberdasher pieces and reading them in a different way to get another image (or another file type altogether). 

Dr. Rob Bootis - SOLVED

Dr. Bootis' page had two relevant clues - one, a reference to a TV show called The Joy of Junque, and two, a strange link that pointed back to the page. This led me to believe that there was a separate website, and the link was "supposed" to point there, but was "broken". It took a little digging at the start of the challenge (probably because the site was still new and not indexed in search engines yet) but this led me to Dr Bootis' personal site, http://three14chart.wix.com/nomnom.

I poked around the site for a bit, and tried the links to social media sites; I noted that the contact points of @djchefmd and info@habogad.com did not exist, and briefly considered registering them just to see what would happen. :) The Facebook link at the bottom of the front page actually pointed to http://three14chart.wix.com/nomnom#!recipes/pp7qt
This held a recipe for Duck Crostini (with a little extra personal "flavor" from Dr. Bootis - he seems charming). I tried finding secrets within the text of the recipe for a while - this was fruitless (yes, food puns). Out of the corner of my eye I noticed the background image of a pie and a fork looked suspicious; the fork holes were dots for some, but lines and dots for others. Suspecting this to be another Morse code message, I tried to find the URL to the image in the page code. Let me tell you, Wix pages are mostly JavaScript spaghetti, and not fun to take apart. I settled for MS Paint and screenshots, but finally decoded the full Morse code message.

Key:teamironman   (surprising, since his avatar on the site wears a Captain America shield)
CAC:b3eb869972c959bdd9642433970ae414

  

Dr. Gi Girond - SOLVED

Dr. Girond's page contains a link to a napkin doodle (http://cybercdc.global/static/doodle.jpg).


I printed and cut out the image and the individual squares. Since I didn't know if I needed to rotate, slide, or otherwise transpose the squares, I marked the original "up" direction on them with red marker in case I needed to start over with new rules. The pieces consisted of lines, corners, t-junctions, and ends, with just enough ends to handle all of the possible branches. Most of the non-end pieces had numbers on them. I realized two important things:
1) If I were allowed to put the pieces anywhere in the 6x6 grid, there would be more than one possible number-solution for the valid paths. If the numbers needed to be in a particular order, this would be a problem.
2) The shapes looked a LOT like the game Net, so the rules might be the same.
I decided to try and win using the rules for Net, and in a few minutes had a valid solution (I added lines for readability):

However, I didn't know what to do with it. What did the numbers stand for?

The Cyber CDC provided two hints that settled it for me:
  • Dr Gi Girond is known for forging a path for himself and calling it as he sees it. #DBIRPuzzle
  • Do your self a good turn and complete Dr. Girond's circuit. #DBIRPuzzle
This told me:
a) The solution is a single path through the grid
b) I would have to call a phone number,
c) turning the pieces was part of the solution and my hunch about Net was probably right.

It was around 10:45 PM my time that I started trying to find valid US phone numbers in my solution. There were several. Since I'd noticed that the WHOIS info for pathogen.ai used a Google Voice number, I guessed that I was looking for another one, and spent some time looking up ways to check if a number belonged to Google Voice or not. After all, I didn't want to bother any strangers, or come up with a reason that I had called them this late at night. Eventually I gave up and started calling. The first number was answered by an older lady.

"...Helloooo?"
"uh...uhm.. *trying to come up with something to say* ...is this Dr. Girond's office?"
"...no."
"OH. Uh, well, I guess I have the wrong, er, number. Sorry" *click*
Lady, whoever you are, I am sorry I bothered you for a puzzle game.

The second number led to a generic answering machine, but the third (571-207-6380) was a Google Voice number! Success! I gave my name as "Dr Girond" and waited for the answering machine to kick in. When it did, it played a pop song that I have to admit I recognized - "All About That Bass" by Meghan Trainor. I had my key.

Key:allaboutthatbass
CAC:9d7e1007ecf7445a02f5ed83b32ff32c

 

Dr. Hans Unum - SOLVED

This was the simplest puzzle of all of them. Dr. Unum's Publications section contained a relative path to a crossword puzzle:
[Unum, H. (2016). My favorite puzzle. Washington Post, Washington D.C.(../images/crossword.png)
This got me http://cybercdc.global/images/crossword.png - I used to do the New York Times crosswords with my family over dinner. Let's go!
 

ACROSS
2. Northern country - SWEDEN
4. Fort, or John (As you prefer) - KNOX
5. Stroll in Moscow. Thanks, Maxim - progulka ?? didn't get this one.
6. Leia's relative - SKYWALKER
8. Native American tribe - CHEYENNE
9. Not cricket, but pretty close - BASEBALL
10. It'll blow your world up, you (when fully operational) - DEATHSTAR
DOWN
1. Sumerian gods (or interplanetary visitors?) - ANNUNAKI
7. The flower of Sweden - LINNEA
9. Intergalactic bounty hunter - BOBAFETT
The one word without a corresponding clue, with an X in the second slot, had to be EXTRATERRESTRIAL based on what we knew about Dr. Unum's interests.

Key:extraterrestrial
CAC:73a11c2a1b0b2593fe4ee7e3e5e2fc79


Dr. Alexi Hawk

Dr. Hawk's page made reference to "the Hawk dataset" and had a hidden reference to ../static/alexi.csv. This was a CSV spreadsheet with three columns labeled V2, V3, V1. We never did figure this one out.
The hints provided were:
  • Dr. Alexi Hawk is known for thinking outside the box and exploring the full sphere of possibilities. #DBIRPuzzle
  • In Dr. Hawk's data set, it's best to focus on the range of data, not just the labels. #DBIRPuzzle
My suspicion is that the solution has to do with a spherical coordinate system in 3D, but I got nowhere. Hopefully one of the other participants will have a writeup, but the Cyber CDC Twitter account did admit at one point:
All pathologists have been recruited by at least one team, other than Drs. Hawk and Tipton. #DBIRPuzzle
So, this was a hard one.


Dr. Oliver Molina - SOLVED

Dr. Molina's was probably the easiest to find and solve. The link on Dr Molina's page gave me some pause:
Molina, O. Collecting Intelligence on Puzzle Challenge Participants Using Macro-Enabled Office Documents. Presented at 11th REintegration Puzzle Conference. August 2015
After opening this on a safe machine, unzipping the file (yes, modern Office docs are zipped XML!) and examining the contents, I decided it was probably harmless and opened it in PowerPoint. Each page contained a symbol from the Pigpen cipher, which spelled out the phrase "threewolfmoon".

Key:threewolfmoon
CAC:89c7c9211cd041376d4b1e52d962c1f9
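The "unzip it and look around" step for the Molina document can be scripted. This is a sketch with helper names of my own; it only lists the parts inside the archive and flags any vbaProject.bin, which is where Office normally stores VBA macros. The demo builds a fake archive in memory rather than using the real file.

```python
import io
import zipfile

def list_office_parts(blob: bytes) -> list[str]:
    """Return the names of the files stored inside an Office Open XML document."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return archive.namelist()

def macro_parts(names: list[str]) -> list[str]:
    """Pick out parts that usually hold VBA macros (vbaProject.bin)."""
    return [n for n in names if n.lower().endswith("vbaproject.bin")]

# Invented stand-in for a macro-enabled presentation.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("ppt/slides/slide1.xml", "<slide/>")
    archive.writestr("ppt/vbaProject.bin", b"\x00")

names = list_office_parts(demo.getvalue())
print(names)
print(macro_parts(names))
```

Listing the parts never executes anything in the document, which is the point of doing it before opening the file in Office.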
 

Dr. Addison Mueller - SOLVED

Dr. Mueller's puzzle was the last one I solved before the endgame, and the one I spent the most time trying to solve. Her bio included the following clues:
  • In fact, she is a poet, though she does not know it. (I didn't know this was a clue until later)
  • Her favorite lotto numbers are 12, 21, 24, 31, 32.
And in a tiny font at the bottom, this encoded message:
OECPS IJDRB ICEID DOFUI DEYSD NFOSJ GUSOK NRUMI SBORX OCXYS HYEFO RQTBR
It took several hints from the Cyber CDC before I figured this one out:
Five lucky numbers
Help defeat the pathogen
Call Dr. Mueller
#DBIRPuzzle
 
Poems both support and inform Dr. Mueller's authentication code. #DBIRPuzzle


It turns out there's a cryptosystem known as a "poem code", which I found early on but didn't quite understand:
It's a little involved, but you need to select several words from a poem both parties have access to, which serves as a key to transpose the columns of letters in your plaintext. I figured I needed access to some text, and the lucky numbers referred to words somehow. I tried using the DBIR itself as the "poem", and selecting words on each page, or using the numbers as page-word pairs (page 1, word 2 for example) but that got me nowhere.
It wasn't until this final hint that I understood what the key was:
@darkstructures Perhaps you should take a break from worrying about Dr. Mueller and read the #DBIR. #DBIRPuzzle
So, I sat down and read the DBIR, which is probably something I should have done from the start. And on page 41, there was a haiku:
Employees lose things
Bad guys also steal your stuff
Full disk encryption

 
I felt pretty dumb - this same haiku had been tweeted by another contestant a few days ago, and I mistook it for a joke. Using the lucky numbers as coordinates, I got:
1-2 = lose
2-1 = bad
2-4 = steal
3-1 = full
3-2 = disk
So the column transposition key was losebadstealfulldisk. I puttered around with this for a while using pen and paper, trying to use the rules of the poem code as I understood them to decipher the message, but I wasn't getting it. Every site I read had instructions on how to encipher with the poem code, but not how to decipher! Frustrated, I went to http://rumkin.com/tools/cipher/, which is one of my favorite sites for crypto puzzles - they have lots of helpful tools that save time with common ciphers (Caesar, Vigenere, etc). I knew the poem code was a transposition cipher using a word-based key, so I tried the Columnar Transposition tool at http://rumkin.com/tools/cipher/coltrans.php, set it to Decrypt, and found a setting for "Key Word(s) - Duplicates numbered forwards" - this is exactly how the key is used in the poem code! I entered the ciphertext and key and the plaintext popped out:
GOODJOBYOUPOETRIEDYOURCODEISRISKYBUSINESSXXFRCCHQDINSRMBDFFJ 
Key:riskybusiness
CAC:8312475fe80119819e749b6527dfa534
 
 

Sir Baskart William - SOLVED

I enjoyed this one the most. Dr. William's page contained an audio tag which (I assume) was meant to play automatically in some browsers:
<audio id="musings_of_sir_baskart_william" src="../static/final.wav" preload="auto">

This WAV file contained a series of 21 dual-tone beeps with a pause between the 19th and 20th. At first I thought they were DTMF (dual-tone multi-frequency) tones, the kind used for phone dialing, but they didn't sound right.
I opened the WAV file up in Audacity and used the Plot Spectrum tool to analyze each beep. The tones weren't right for DTMF, but they were correct for SS5:

I decoded the values for each:
1st   2nd   SS5      Dec  Hex
700   1500  7        7    7
1100  1500  9        9    9
1100  1300  6        6    6
1500  1700  ST       15   F
1100  1300  6        6    6
900   1700  Code 12  12   C
1100  1300  6        6    6
1500  1700  ST       15   F
1100  1300  6        6    6
700   1100  2        2    2
1100  1300  6        6    6
700   900   1        1    1
1100  1300  6        6    6
900   1100  3        3    3
1100  1300  6        6    6
700   1700  Code 11  11   B
700   1500  7        7    7
900   1300  5        5    5
700   1500  7        7    7
SPACE
700   1500  7        7    7
900   1100  3        3    3
7 9 6 15 6 12 6 15 6 2 6 1 6 3 6 11 7 5 7   7 3 
I struggled for a while trying to figure these out. One particularly fun tangent had me trying different domains with 2-letter TLDs (assuming the space was there for a reason) and treating each decimal number as its corresponding alphabet letter.
The Cyber CDC eventually dropped a hint that got me the solution immediately:
Dr. William's first, unpublished paper, was "The hex: A study of blue magic amongst cyber practitioners." #DBIRPuzzle
Translating the numbers over 9 into hex and pairing them up into ASCII codes gave me "yolobackups".
Key:yolobackups
CAC:6fbbb157a00d8bf8e3a3b7b1b91a3e38
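The hex step can be sketched in a few lines. The frequency pairs and values come straight from my table above; one thing is an assumption on my part: I treat the pause as a 0 digit, because that is what makes the digits pair up evenly and spell the key.

```python
# Frequency pair (Hz) -> value, taken from the table in this writeup.
SS5_VALUES = {
    (700, 900): 1, (700, 1100): 2, (900, 1100): 3, (900, 1300): 5,
    (1100, 1300): 6, (700, 1500): 7, (1100, 1500): 9,
    (700, 1700): 11, (900, 1700): 12, (1500, 1700): 15,
}
PAUSE = None  # the silent gap in the recording

tones = [
    (700, 1500), (1100, 1500), (1100, 1300), (1500, 1700), (1100, 1300),
    (900, 1700), (1100, 1300), (1500, 1700), (1100, 1300), (700, 1100),
    (1100, 1300), (700, 900), (1100, 1300), (900, 1100), (1100, 1300),
    (700, 1700), (700, 1500), (900, 1300), (700, 1500), PAUSE,
    (700, 1500), (900, 1100),
]

def decode_tones(sequence) -> str:
    """Turn each tone into one hex digit, then read the digits as ASCII bytes."""
    digits = "".join(
        "0" if tone is PAUSE else format(SS5_VALUES[tone], "x")  # assumption: pause = 0
        for tone in sequence
    )
    return bytes.fromhex(digits).decode("ascii")

print(decode_tones(tones))
```

Without the pause counted as a digit there are 21 hex digits, which cannot be split into whole bytes.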
 

Dr. Pedro Tipton

I wasn't able to solve Dr. Tipton's puzzle, but I was at least able to find it - his photo on the site showed some letters on the left hand side. On the page it was cropped, but by looking at the whole image you could get the rest of the letters.
The hints were:
  • Dr. Pedro Tipton is a brilliant researcher but his contributions often remain in the background.
  • New Paper: Tipton, P. (2016) Reassembling Genetic Sequences using Pencil and Paper: Sometimes a puzzle starts with a puzzle.
  • Dr. Tipton thinks that pyrimidines are #1.
Pyrimidines include thymine (T) and cytosine (C). I played with making them represent 1, and A and G represent 0, but didn't get anywhere with that.
I believe that the x's in the message split it into two parts - an initial puzzle and a secondary puzzle. The result of the first would serve as a key to the second. Sadly, I didn't figure out the first, so I'm waiting to see if anyone else did.
 

Endgame

In order to progress to the endgame, the Cyber CDC said we needed to solve 4 puzzles:
The goal--find patient 0; Data is in the infection log. Minimum team size: 4 (2 more = better) #DBIRPuzzle
They also gave us hints:
  • Pathogen logs appear to be rudimentary logs of infection graphs of the cyber pathogen. #DBIRPuzzle
  • Initial analysis of logs by our pathologist shows the presence of not just hashes, but hashes of hashes in pathogen logs. #DBIRPuzzle
  • Column two is where the blame lies for each infection. Look for Patient Zero there. #DBIRPuzzle
  • We believe Patient 0 is in the hashes within the log files leaked directly by the AI. #DBIRPuzzle
After solving 4 and entering those CACs into the site, the console on cyber.pathogen.ai gave us this:
"Beginning monitoring Global cyberCDC team for cyber pathologists."
"Significant Cyber Pathologists Detected! Infection rates may be affected."
"Pathologist 73a11c2a1b0b2593fe4ee7e3e5e2fc79 identified."
"Pathologist b3eb869972c959bdd9642433970ae414 identified."
"Pathologist 6fbbb157a00d8bf8e3a3b7b1b91a3e38 identified."
"Pathologist 89c7c9211cd041376d4b1e52d962c1f9 identified."
"Cyber pathologist team has endangered core operations."
"Duplicating cyber pathogen state https://s3.amazonaws.com/pathogen.ai/infection_log.8f94829b479a2585e080ab0d4a39df89 to preserve disaster recovery options.  Preparing for additional pathologists."
This file contained 500 lines, including a header, of hashes and other data:
# node, infection path hash(node/node), infected, enc_data
"007d888802343910819e8507676ea99f", "17d4e078de6206de8c18fbc2cc1a3eec", "3662bae777a4e538c5581a3caa2621b2", "7DAogxprVJm+lGBHAzTwsLnYdkHPvKYloXZu9GE0jsA3M9yBSdgXEWFbhdvF50nE"
"007d888802343910819e8507676ea99f", "bc7b9f811f18d2846c4a6cc8c8c3a166", "0d134ed960d1e69dad7a4765abc0efa3", "+Uqjr82Qwa7AUppzm/TrpY2ESnzGvaZacDWQPAzh7bME/onCzsCQn51DVy9Aq7z0bFhzugS3rr+wdFYvHY6ycQ=="
"007d888802343910819e8507676ea99f", "bc7b9f811f18d2846c4a6cc8c8c3a166", "d0f1fee9a848b7f515908ed9ff9b1c14", "+Uqjr82Qwa7AUppzm/TrpY2ESnzGvaZacDWQPAzh7bME/onCzsCQn51DVy9Aq7z0bFhzugS3rr+wdFYvHY6ycQ=="
"007d888802343910819e8507676ea99f", "72b8d7897195cae96fac53f9f77faeb4", "fae2bde67efb4d23b90f9af83d94a0a6", "d5smQ0RQFtFr26pslF0GSJuhX0KIR00VqxvNhPQNG2Q="
...
 
My colleagues and I were unsure what each column meant, but we determined that the "infection path hash" column was a hash of two other hashes, with a "/" in the middle, as in MD5(<MD5sum1>/<MD5sum2>). The data in enc_data was not printable, and I don't think we ever figured out what it represented.
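The "hash of hashes" structure, and the kind of brute-force check my colleague's scripts performed, can be sketched as follows. Everything here is illustrative: the function names and the candidate names are made up, and which hash sits on which side of the "/" is our guess rather than something the log states.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Hex MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def path_hash(infector: str, infected: str) -> str:
    """MD5(<MD5 of infector>/<MD5 of infected>), the structure we inferred."""
    return md5_hex(f"{md5_hex(infector)}/{md5_hex(infected)}")

def find_infector(target_hash: str, infected_md5: str, candidates):
    """Return the first candidate name whose path hash matches, else None."""
    for name in candidates:
        if md5_hex(f"{md5_hex(name)}/{infected_md5}") == target_hash:
            return name
    return None

# Invented example: recover "alice" from a path hash we built ourselves.
target = path_hash("alice", "bob")
print(find_infector(target, md5_hex("bob"), ["carol", "alice"]))
```

The search is only as good as the candidate list, which is why it failed for us when Patient Zero was not among the known names.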
I was curious what the MD5 sums represented, and a Google query got me here:
This mapped all of the known MD5 hashes in the file to victim names. With a little (shoddy) grep work, I thought I had ID'ed Patient Zero, and sent in a submission. However, I was wrong:
Yeah, that was me. But hey, I wasn't the only participant to screw up - I was just the first.
My colleague wrote up a set of scripts to try and brute force the values of MD5sum1, on the assumption that Patient Zero would not appear in the victims list. Despite his hard work, we didn't get a solution.
I decided to try and solve more puzzles, believing I would get some kind of hint. And when I did, I was right! I got a new link in the console output when I submitted 6 CACs: https://s3.amazonaws.com/pathogen.ai/infection_log.eeda0ec273f57ae84d0b911b6fa3f2f2
The file no longer contained the "hash of hashes":
# node, infected_by, infected, enc_data
"007d888802343910819e8507676ea99f", "48537bacbf657db90376a9cc9ba95c2f", "3662bae777a4e538c5581a3caa2621b2", "7DAogxprVJm+lGBHAzTwsLnYdkHPvKYloXZu9GE0jsA3M9yBSdgXEWFbhdvF50nE"
"007d888802343910819e8507676ea99f", "995282e144e9798b729d7fd5bf03afa0", "0d134ed960d1e69dad7a4765abc0efa3", "+Uqjr82Qwa7AUppzm/TrpY2ESnzGvaZacDWQPAzh7bME/onCzsCQn51DVy9Aq7z0bFhzugS3rr+wdFYvHY6ycQ=="
"007d888802343910819e8507676ea99f", "995282e144e9798b729d7fd5bf03afa0", "d0f1fee9a848b7f515908ed9ff9b1c14", "+Uqjr82Qwa7AUppzm/TrpY2ESnzGvaZacDWQPAzh7bME/onCzsCQn51DVy9Aq7z0bFhzugS3rr+wdFYvHY6ycQ=="
"007d888802343910819e8507676ea99f", "1a9c3db78200641d3066fc31c5a9046c", "fae2bde67efb4d23b90f9af83d94a0a6", "d5smQ0RQFtFr26pslF0GSJuhX0KIR00VqxvNhPQNG2Q="
...
We found the one hash in the infected_by list that didn't appear elsewhere in the file:
# infected_by column (field 2), with the leading space stripped
cut -f2 -d"," infection_log.eeda0ec273f57ae84d0b911b6fa3f2f2|sed 's/ //g'|sort -u>infectors
# node column (field 1)
cut -f1 -d"," infection_log.eeda0ec273f57ae84d0b911b6fa3f2f2|sort -u>infectees
# infectors that never show up as a node
fgrep -v -f infectees infectors
"206e52ca829a42776c8b7b9c81cfd061"
 
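The same search can be written as a set difference. This is a rough Python equivalent of the shell commands above, comparing column 2 (infected_by) against column 1 (node); the function name and the three-letter "hashes" in the sample are invented.

```python
import csv
import io

def unexplained_infectors(log_text: str) -> set[str]:
    """Return infected_by values (column 2) that never appear as a node (column 1)."""
    rows = [
        row
        for row in csv.reader(io.StringIO(log_text), skipinitialspace=True)
        if row and not row[0].startswith("#")  # skip the header comment
    ]
    nodes = {row[0] for row in rows}
    infectors = {row[1] for row in rows}
    return infectors - nodes

# Tiny invented log in the same layout as the real file.
sample = '''# node, infected_by, infected, enc_data
"aaa", "bbb", "ccc", "x"
"bbb", "zzz", "aaa", "y"
'''
print(unexplained_infectors(sample))
```

For the real log, read the downloaded file's contents and pass them in as log_text.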
Success, we have a hash! But who's patient zero? We could have tried to crack it ourselves, but there are websites that will do it for us...or might have already done it. Google shows me the second case is true:
Just to be sure I computed the hash in the other direction:
echo -n "kevinthompson"|md5sum
206e52ca829a42776c8b7b9c81cfd061
I sent off my email to cyber_cdc@mail.com and waited...and waited...and waited. But finally I got an email back:
You have found Patient 0, the pathogen has been defeated! Congratulations!!
Good stuff. I'm going to have to play the DBIR puzzle again next year; maybe I can get better than third place.


Easter Eggs

The WHOIS information for cybercdc.global and pathogen.ai was amusing.
Dr. Lawrence Angelo and Virtual Space Industries are references to The Lawnmower Man:
Complete Domain Name........: pathogen.ai
Organization Using Domain Name
 Organization Name..........: Virtual Space Industries
 Street Address.............: PO Box 1161
 City.......................: Fairview
 State......................: TN
 Postal Code................: 37062-1161
 Country....................: USA
Administrative Contact
 NIC Handle (if known)......: Dr. Lawrence Angelo
 (I)ndividual (R)ole........: Senior Cyber Intelligence Scientist
 Name (Last, First).........: Angelo, Lawrence
 Organization Name..........: Virtual Space Industries
 Street Address.............: PO Box 1161
 City.......................: Fairview
 State......................: TN
 Postal Code................: 37062-1161
 Country....................: USA
 Phone Number...............: (908) 376-6405
 Fax Number.................:
 E-Mailbox..................: gabe@infosecanalytics.com
...

 
Erin Mears is a reference to the movie Contagion:
 

Registrant Contact Information:
    Name: Erin Mears
    Organization: Cyber Center for Disease Control
    City: Fairview
    State: TN
    Zip: 37062
    Country: US
    Phone: +1.9083766405
    Email: cyber_cdc@mail.com
Administrative Contact Information:
    Name: Erin Mears
    Organization: Cyber Center for Disease Control
    City: Fairview
    State: TN
    Zip: 37062
    Country: US
    Phone: +1.9083766405
    Email: cyber_cdc@mail.com
Technical Contact Information:
    Name: Erin Mears
    Organization: Cyber Center for Disease Control
    City: Fairview
    State: TN
    Zip: 37062
    Country: US
    Phone: +1.9083766405
    Email: cyber_cdc@mail.com
    
I actually emailed langelo@infosecanalytics.com and left a voicemail message, and got a response from Gabriel Bassett; I was really expecting an autoresponder to give me a clue (sorry!) but he did share an interesting fact or two about the map:
Most of the credit goes to Bob Rudis. I just modified his pew pew map to operate a little differently.
Fun fact, I programmed the probability of spread from country to country to be based off of immigration data between countries.

There were a few more doctors' names that were obvious Easter eggs in the Publications sections of each pathologist's page:
  • Pokemon, L - reference to Larry Ponemon of the Ponemon Institute, and also the popular monster-catching video game Pokemon.
  • Queso, Charles E - Chuck E Cheese, the pizza-arcade with the terrifying animatronics. I went once as a child and will never go back.
  • Caelum-Ambulemus, L - roughly translated from Latin, "Luke Skywalker".
  • Baca, C - probably "Chew Baca", or Chewbacca, another Star Wars reference
Also, the domain "habogad.com" is mentioned in the Bootis puzzle. Habogad reversed is "Dagobah", which is Yoda's home planet in The Empire Strikes Back.

GD Bassett has posted the code for the pathogen map.
The app itself contains some neat secrets:

  • The Tipton solution was "primenumbers".
  • The Haberdasher solution was "ilovehats".
  • There is a message for the impossible case (>9 pathologists solved) which gives up the name of Patient Zero.