SANS Digital Forensics and Incident Response Blog

pdgmail: new tool for gmail memory forensics

I saw John McCash's article on GMail forensics ... I was hooked and created pdgmail.

I've been messing around with the volatile toolkit for memory forensics and thought I'd try my hand at GMail memory forensics. As John says, the GMail data isn't supposed to end up on disk anyway, so maybe it's in the browser memory?

Boy is it!

I used the pd dump tool from www.trapkit.de, available here, and tested against my meager GMail account on Windows XP and 2000 with IE 6, IE 7 and Firefox 3. In all cases I was able to retrieve contact data, last login times and IP addresses, basic email headers and email bodies. Even if the browser was 'logged out' of GMail, it still retained this data, including messages that were never opened and contacts that were never used. Simply loading the GMail UI puts all this data in the memory image.

How to use?

First step is to gather the browser memory. Here's a sample pd session where 6352 is the PID of a running IE instance:

E:\Program Files\tools>pd -p 6352 > 6352.dump
pd, version 1.1 tk 2006, www.trapkit.de

Dump finished.

E:\Program Files\tools>dir
Directory of E:\Program Files\tools

09/27/2008 06:57 PM 117,908,254 6352.dump

Whoa, big file! But this is forensics; we don't scare at large data sets. To use the pdgmail tool, run this memory dump through strings -el to create a strings file, then either cat that file through pdgmail or run pdgmail with the -f flag specifying your strings filename. Example:

strings -el 6352.dump | pdgmail | less

Best mileage will be with Python 2.4.4 or 2.5 on Linux. I haven't tested it below those versions or on Windows.
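If strings isn't handy (on a Windows box, say), the strings -el step can be roughly approximated in a few lines of Python. This is only a sketch and not part of pdgmail: it assumes Python 3, reads the whole dump into memory, and the function name utf16le_strings and the minimum length of 4 are choices made for illustration.

```python
import re
import sys

# Runs of 4 or more printable ASCII characters, each followed by a NUL byte,
# which is how ASCII text looks when stored as 16-bit little-endian.
UTF16LE_RUN = re.compile(rb'(?:[\x20-\x7e]\x00){4,}')


def utf16le_strings(data):
    """Yield printable UTF-16LE strings found in a bytes object."""
    for match in UTF16LE_RUN.finditer(data):
        yield match.group().decode('utf-16-le')


if __name__ == '__main__' and len(sys.argv) > 1:
    # Usage (hypothetical script name): python le_strings.py 6352.dump > 6352.strings
    with open(sys.argv[1], 'rb') as dump:
        for text in utf16le_strings(dump.read()):
            print(text)
```

The resulting text file can then be handed to pdgmail with the -f flag as described above.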

It looks for these things:

  • contacts
  • last access records
  • GMail account names
  • message headers
  • message bodies

Contacts show up as:
contact: name: "jeff bryner" email: "myemailaddress@gmail.com"

Last Access records show most recent two logins and appear as:
last access: "14 hours ago" from IP "10.15.26.8", most recent access Tue Oct 14 10:57:53 2008 from IP "12.9.4.238"

Email messages are the messiest, mostly because memory artifacts don't always conform to API standards, so picking them out is a best guess.

Using the most familiar email of all, headers show up as:
message header: ["ms","113b0d734737dec4","",4,"Gmail Team ","Gmail Team","mail-noreply@google.com",1184082900000,"Did you know that GMail was voted #2 in PC World's Top 100 products of 2005, ...",["^all","^i"]
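When a header record survives intact, it can be loaded as a JSON array for further analysis. The sketch below is an illustration rather than pdgmail's own code: it assumes Python 3, assumes the record is closed with a final bracket (the sample above is cut off before it), and the field meanings are guessed from the sample alone. The dictionary keys are hypothetical names.

```python
import json
from datetime import datetime, timezone

# Sample record from above; the final closing bracket is an assumption.
SAMPLE = ('message header: ["ms","113b0d734737dec4","",4,"Gmail Team ",'
          '"Gmail Team","mail-noreply@google.com",1184082900000,'
          '"Did you know that GMail was voted #2 in PC World\'s Top 100 '
          'products of 2005, ...",["^all","^i"]]')


def parse_header(line):
    """Split a 'message header:' line into named fields (positions guessed)."""
    record = json.loads(line[line.index('['):])
    return {
        'message_id': record[1],
        'sender_name': record[5],
        'sender_email': record[6],
        # The large integer looks like milliseconds since the Unix epoch.
        'sent_utc': datetime.fromtimestamp(record[7] / 1000, tz=timezone.utc),
        'snippet': record[8],
        'labels': record[9],
    }


if __name__ == '__main__':
    for key, value in parse_header(SAMPLE).items():
        print(key, '=', value)
```

Records recovered from memory are often truncated, so json.loads will raise ValueError on many of them; real use would need to catch that and fall back to the raw line.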

Message bodies are parsed to turn the unicode into proper html:

Did you know that GMail was voted #2 in PC World's Top
100 products of 2005
, right after Firefox? Why wouldn't you want to
switch? Well, because it can be a pain to switch to a new email
address. We know.

etc...
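The unicode handling can be pictured roughly as follows. This is a sketch under assumptions, not the actual pdgmail code: it assumes Python 3 and assumes the body text in memory carries markup as literal \uXXXX escape sequences (for example \u003c for an opening angle bracket). The function name decode_unicode_escapes is made up for the example.

```python
import re

# A literal backslash, the letter u, then exactly four hex digits.
ESCAPE_RE = re.compile(r'\\u([0-9a-fA-F]{4})')


def decode_unicode_escapes(text):
    """Replace literal \\uXXXX sequences with the characters they stand for."""
    return ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 16)), text)


if __name__ == '__main__':
    raw = r'\u003cb\u003eDid you know\u003c/b\u003e that GMail was voted #2'
    print(decode_unicode_escapes(raw))
```

Text without any escapes passes through unchanged, so the function is safe to run over every candidate line.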

Nothing fancy, just some glorified regex and unicode handling dumped to stdout. It parses if possible, otherwise it just spits out a familiar line. Feel free to send me patches, tweak, rewrite, etc. Hope it helps someone!

Jeff Bryner, GCFA Gold #137, also holds the CISSP and GCIH certifications, occasionally teaches for SANS and performs forensics, intrusion analysis, and security architecture work on a daily basis.

19 Comments

Posted October 20, 2008 at 12:28 PM | Permalink | Reply

robtlee

Would this also work on a full memory dump from windows created by mdd or win32dd?

Posted October 20, 2008 at 1:24 PM | Permalink | Reply

jeffbryner

Yup, just tested with mdd_1.3.exe and it shows the same info.
It's a bit slower, repeats the findings quite a bit, and hits false positives for the message bodies, but it works.
I'll see if I can refine the message body regex to weed out the falses.

Posted November 6, 2008 at 6:48 PM | Permalink | Reply

johnhsawyer

Jeff, excellent job! I've successfully tested it with memory dumps from mdd, win32dd, Memoryze, a raw export of memory from a winen acquired dump and memory acquisitions using F-Response 2.03 beta.
Rob, when working with a full memory dump from any of the formats that Volatility or Memoryze supports, you could skip the headache of repeats and false positives by first extracting out the process' full memory space and then searching just it.
One cool thing about using the full memory dump, though, is that if a web browser isn't running when you do the memory dump, you'll still find artifacts floating in memory even though you can't extract the process. Volatility's psscan/2 can show that a browser was running and has exited to help clue you into that.

Posted November 8, 2008 at 2:19 PM | Permalink | Reply

jeffbryner

OK, updated the tool to work better in large memory dumps where there is more stuff that smells like a message body. Same link, but now it's version 2.0.
I also added a command line option to skip searching for message bodies (-b or --bodies). Use it if you get too many false positives on message bodies.

Posted January 22, 2009 at 4:14 PM | Permalink | Reply

roayers

Has anyone used Encase Enterprise to capture the running executables and RAM? I tried this today, then ran an enscript to convert the executable and the memory capture to separate dd format files. I then tried the above process, and here are the results. I used Yahoo email prior to the capture of the RAM and the iexplore.exe executable, so I know Yahoo content existed. I also confirmed this from within Encase during the examination of those files.
Any thoughts?
python pdymail -f ramstrings.txt

Posted January 22, 2009 at 4:17 PM | Permalink | Reply

roayers

Here is the output:
python pdymail -f ramstrings.txt

Posted June 29, 2009 at 9:55 PM | Permalink | Reply

Israel

Do you save the pdgmail command line as a batch file or an executable? I can see the line when I click "pdgmail" but I am not sure if I need to download the tool itself somewhere. Sorry if this appears like a dumb question.
Israel.

Posted June 30, 2009 at 3:45 PM | Permalink | Reply

jeffbryner

@israel: you'll need python (python.org) and then save the script somewhere as a python interpreted script and run it via python. The script itself is built for unix (reference to /usr/bin/python) but you can alter that to your python executable if you are a windows type.
@roayers: didn't get your comment until now (usually there is an email) but for pdymail I haven't tried using encase. It would be interesting to see if the steps you took work using some other memory imager?

Posted August 10, 2011 at 12:44 AM | Permalink | Reply

Brandi

I want to recover my deleted "forever" gmail. How do I do this?

Posted September 12, 2011 at 2:53 PM | Permalink | Reply

Tim

This script did not output any gmail addresses. I found them, however, using grep "gmail" against my strings file. So there are definitely gmail addresses in my strings file, but your script does not find them.
Thoughts?? I find it hard to believe I'm the only one with this problem.

Posted September 12, 2011 at 3:17 PM | Permalink | Reply

jeffbryner

The script is looking for gmail 'contact records' rather than just gmail.com entries. The records looked like this at the time of writing the script: ["ct","contactname","emailaddress@gmail.com",0,"3"] Do you see anything like that in your memory dump? It could simply be that the format has changed. The regex used is re.compile('(?:\\[\\"ct.*\\])') which you can alter to fit any new formats you may find.
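For anyone wanting to experiment with that pattern against their own strings file, here is a minimal sketch. It assumes Python 3 and the record layout quoted above (name in the second field, address in the third); the helper name find_contacts and the JSON parsing step are illustrative additions, not how pdgmail itself processes matches.

```python
import json
import re

# The contact pattern quoted above, unchanged.
CONTACT_RE = re.compile('(?:\\[\\"ct.*\\])')


def find_contacts(lines):
    """Yield (name, email) pairs from lines that hold a ["ct", ...] record."""
    for line in lines:
        match = CONTACT_RE.search(line)
        if not match:
            continue
        try:
            record = json.loads(match.group())
        except ValueError:
            continue  # truncated or mangled record, skip it
        if isinstance(record, list) and len(record) >= 3:
            yield record[1], record[2]


if __name__ == '__main__':
    sample = ['noise ["ct","contactname","emailaddress@gmail.com",0,"3"]',
              'someone@gmail.com with no contact record']
    for name, email in find_contacts(sample):
        print('contact: name:', name, 'email:', email)
```

A line that only contains a bare gmail.com address produces nothing here, which matches the behavior described: grep finds the address but the contact pattern does not.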

Posted March 9, 2012 at 4:54 AM | Permalink | Reply

Austin

So this may be an old thread and this may be a noob question, but I got to where I can use pd. How do I get the PID for gmail in Chrome or Firefox? That would be a great help if you could tell me. Also, what exactly did you mean by making a strings file? Can you give me the steps to do that? I'll take as much info as possible without asking for a full step by step lol

Posted March 9, 2012 at 5:20 AM | Permalink | Reply

jeffbryner

The pid for a process can be found in task manager or via ps -ef on linux. Dump the memory for that pid using pd, then strings -el the dump file you just created and you should be golden.
You are right the post is 4yrs old now..but the process should remain the same.
Good luck!
Jeff

Posted March 9, 2012 at 6:00 PM | Permalink | Reply

Austin

Hey, it's me again. Right after I posted my last question I actually figured it out with a little trial and error, but now I'm really stumped. Since there's not actually a strings command on Windows, which is what I'm running, how do I go about creating the strings file? Do I need to download a third party command, and if so, is there one you recommend? Or do you have a method for doing this on Windows? I tried pdgmail.py -f iegmail.dump but it does nothing, so I'm obviously missing a step: either the strings step is a must and I need to know how to go about that, or you have another method you substitute on Windows. I saw that when this was posted it wasn't tested on Windows, but I know Windows users have had success with it; I just can't seem to find their method. Thanks again for the quick response, and any help is greatly appreciated.

Posted March 9, 2012 at 7:30 PM | Permalink | Reply

jeffbryner

http://technet.microsoft.com/en-us/sysinternals/bb897439 is a version of strings built for windows. Alternatively you can transfer the PID dump file to a linux box (or BackTrack VM) and run strings on it from there.
Good luck!
Jeff.

Posted March 10, 2012 at 12:32 AM | Permalink | Reply

Austin

That worked. I'm not completely sure why I found nothing except what was in the inbox, instead of some of my deleted messages and contacts like I was hoping, but at least I got this far. I appreciate your help a lot. If you have any idea why that would happen, please let me know. I am also trying to figure out how to export all this info into a file I can open outside the command prompt so I can view it later without having to run it again.
thanks again for all the help you have already provided

Posted March 10, 2012 at 12:39 AM | Permalink | Reply

jeffbryner

Glad you got it!
The tool is a bit old, so it's certainly possible gmail changed their format which would invalidate the tool. If you're brave you could take a shot at updating the regexes in the tool. If what you are after is in the strings'd text file the tool should be able to recover it. If the data isn't in the strings'd text file then it wasn't in memory and the tool doesn't have a chance at recovering it.

Posted March 10, 2012 at 12:47 AM | Permalink | Reply

Lorenzo

Hi Jeff, I am experiencing problems with the tool: in both Linux and Windows environments I cannot get the program to parse the strings output file. After creating the dump and passing it to the strings command, when I execute "python.exe pdgmail.py -fv > > " I get "Cannot open file". Even copying the file into the python directory didn't help.
On the Linux machine I simply got an empty file after launching the same command.
In the Windows machine I'm using Strings v2.42 and Python27, the Linux machine is a Backtrack 5.
Could you please help me to fix it?
Thank you in advance, Lorenzo

Posted March 10, 2012 at 12:55 AM | Permalink | Reply

jeffbryner

The filename should come just after the -f option. -fv apparently confuses the args library into thinking you're trying to open a file named 'v'. So try it with pdgmail.py -v -f filename
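The behavior is easy to reproduce with a standard option parser. The sketch below uses Python 3's argparse as a stand-in; which library pdgmail actually uses, and the long option names --file and --verbose, are assumptions made for the illustration.

```python
import argparse

# Stand-in parser with the two short options discussed above.
parser = argparse.ArgumentParser(prog='pdgmail-sketch')
parser.add_argument('-f', '--file', help='strings file to parse')
parser.add_argument('-v', '--verbose', action='store_true')

# "-fv": -f expects a value, so the rest of the token ("v") becomes the filename.
bundled = parser.parse_args(['-fv'])

# "-v -f filename": each option is parsed on its own.
separate = parser.parse_args(['-v', '-f', 'strings.txt'])

if __name__ == '__main__':
    print('bundled :', bundled.file, bundled.verbose)
    print('separate:', separate.file, separate.verbose)
```

Because -f takes an argument, anything glued onto it is read as that argument, so flags that take no value should go before it or be given separately.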