Blog: SANS Digital Forensics and Incident Response Blog

Blog: SANS Digital Forensics and Incident Response Blog

Google Chrome Forensics

Google Chrome stores the browser history in a SQLite database, not unlike Firefox. Yet the structure of the database file is quite different.

Chrome stores its files in the following locations:

  • Linux: /home/$USER/.config/google-chrome/
  • Linux: /home/$USER/.config/chromium/
  • Windows Vista (and Win 7): C:\Users\[USERNAME]\AppData\Local\Google\Chrome\
  • Windows XP: C:\Documents and Settings\[USERNAME]\Local Settings\Application Data\Google\Chrome\
There are two different versions of Google Chrome for Linux, the official packets distributed by Google, which stores its data in the google-chrome directory and the Linux distributions version Chromium.

The database file that contains the browsing history is stored under the Default folder as "History" and can be examined using any SQLlite browser there is (such as sqlite3). The available tables are:

  • downloads
  • presentation
  • urls
  • keyword_search_terms
  • segment_usage
  • visits
  • meta
  • segments
The most relevant tables for browsing history are the "urls" table that contains all the visited URLs, the "visits" table that contains among other information about the type of visit and the timestamps and finally the "downloads" table that contains a list of downloaded files.

If we examine the urls table for instance by using sqlite3 we can see:

sqlite> .schema urls
CREATE TABLE urls(id INTEGER PRIMARY KEY,url LONGVARCHAR,title LONGVARCHAR,visit_count INTEGER DEFAULT 0 NOT NULL,
typed_count INTEGER DEFAULT 0 NOT NULL,last_visit_time INTEGER NOT NULL,hidden INTEGER DEFAULT 0 NOT NULL,
favicon_id INTEGER DEFAULT 0 NOT NULL);
CREATE INDEX urls_favicon_id_INDEX ON urls (favicon_id);
CREATE INDEX urls_url_index ON urls (url);

And the visits table

sqlite> .schema visits
CREATE TABLE visits(id INTEGER PRIMARY KEY,url INTEGER NOT NULL,visit_time INTEGER NOT NULL,from_visit INTEGER,transition INTEGER DEFAULT 0 NOT NULL,segment_id INTEGER,is_indexed BOOLEAN);
CREATE INDEX visits_from_index ON visits (from_visit);
CREATE INDEX visits_time_index ON visits (visit_time);
CREATE INDEX visits_url_index ON visits (url);

So we can construct a SQL statement to get some information about user browser habit.

SELECT urls.url, urls.title, urls.visit_count, urls.typed_count, urls.last_visit_time, urls.hidden, visits.visit_time, visits.from_visit, visits.transition
FROM urls, visits
WHERE
urls.id = visits.url

This SQL statement extracts all the URLs the user visited alongside the visit count, type and timestamps.

If we examine the timestamp information from the visits table we can see they are not constructed in an Epoch format. The timestamp in the visit table is formatted as the number of microseconds since midnight UTC of 1 January 1601, which other have noticed as well, such as firefoxforensics.

If we take a look at the schema of the downloads table (.schema downloads) we see

CREATE TABLE downloads (id INTEGER PRIMARY KEY,full_path LONGVARCHAR NOT NULL,url LONGVARCHAR NOT NULL,
start_time INTEGER NOT NULL,received_bytes INTEGER NOT NULL,total_bytes INTEGER NOT NULL,state INTEGER NOT NULL);

And examine the timestamp there (the start_time) we can see that it is stored in Epoch format.

There is one more interesting thing to mention in the "visits" table. It is the row "transition". This value describes how the URL was loaded in the browser. For full documentation see the source code of page_transition_types or in a shorter version the core parameters are the following:

  • LINK. User go to the page by clicking a link.
  • TYPED. User typed the URL in the URL bar.
  • AUTO_BOOKMARK. User got to this page through a suggestion in the UI, for example,through the destinations page
  • AUTO_SUBFRAME. Any content that is automatically loaded in a non-toplevel frame. User might not realize this is a separate frame so he might not know he browsed there.
  • MANUAL_SUBFRAME. For subframe navigations that are explicitly requested by the user and generate new navigation entries in the back/forward list.
  • GENERATED. User got to this page by typing in the URL bar and selecting an entry that did not look like a URL.
  • START_PAGE. The user's start page or home page or URL passed along the command line (Chrome started with this URL from the command line)
  • FORM_SUBMIT. The user filled out values in a form and submitted it.
  • RELOAD. The user reloaded the page, whether by hitting reload, enter in the URL bar or by restoring a session.
  • KEYWORD. The url was generated from a replaceable keyword other than the default search provider
  • KEYWORD_GENERATED. Corresponds to a visit generated for a keyword.
The transition variable contains more information than just the core parameters. It also stores so called qualifiers such as whether or not this was a client or server redirect and if this a beginning or an end of a navigation chain.

When reading the transition from the database and extracting just the core parameter the variable CORE_MASK has to be used to AND with the value found inside the database.

CORE_MASK = 0xFF,

I've created an input module to log2timeline to make things a little bit easier by automating this. At this time the input module is only available in the nightly builds, but it will be released in version 0.41 of the framework.

An example usage of the script is the following:

log2timeline -f chrome -z local History

...

0|[Chrome] User: kristinng URL visited: http://tools.google.com/chrome/intl/en/welcome.html (Get started with Google Chrome) [count: 1] Host: tools.google.com type: [START_PAGE - The start page of the browser] (URL not typed directly)|0|0|0|0|0|1261044829|1261044829|1261044829|1261044829

...

0|[Chrome] User: kristinng URL visited: http://isc.sans.org/ (SANS Internet Storm Center; Cooperative Network Security Community - Internet Security) [count: 1] Host: isc.sans.org type: [TYPED - User typed the URL in the URL bar] (directly typed)|0|0|0|0|0|1261044989|1261044989|1261044989|1261044989..

The script reads the user name from the directory path the history file was found and then reads the database structure from the History file and prints out the information in a human readable form (this output is in mactime format). To convert the information found here in CSV using mactime
log2timeline -f chrome -z local History > bodyfile

mactime -b bodyfile -d > body.csv

And the same lines in the CSV file are then:
Thu Dec 17 2009 10:13:49,0,macb,0,0,0,0,[Chrome] User: kristinng URL visited: http://tools.google.com/chrome/intl/en/welcome.html (Get started with Google Chrome) [count: 1] Host: tools.google.com type: [START_PAGE - The start page of the browser] (URL not typed directly)
Thu Dec 17 2009 10:16:29,0,macb,0,0,0,0,[Chrome] User: kristinng URL visited: http://isc.sans.org/ (SANS Internet Storm Center; Cooperative Network Security Community - Internet Security) [count: 1] Host: isc.sans.org type: [TYPED - User typed the URL in the URL bar] (directly typed)

 
Kristinn Guđjónsson, GCFA #5028, is the author of log2timeline and several other scripts as well as being the team leader of information security at Skyggnir, forensic analyst, incident handler an a local mentor for SANS.

15 Comments

Posted November 09, 2010 at 1:30 AM | Permalink | Reply

Keith Orlovsky

Hello, is there a specific file where all of this data is saved? Like for instance, a .dat or is it comparable to Mozilla's .sqllite?

Posted November 09, 2010 at 4:05 PM | Permalink | Reply

kristinn

The database file that contains the browsing history is stored under the Default folder as History and can be examined using any SQLlite browser there is (such as sqlite3).
So the file is called History that stores the data... and it is a SQLIte database, just like Firefox uses.

Posted November 25, 2010 at 2:24 PM | Permalink | Reply

Mike Roszko

What exactly is the Google time format used in the "last_visit_time" parameter of the "urls" table. It is not epoch time. The only information I have found so far is that it is Google's own variation of Windows filetime? How do I convert this to UTC?

Posted November 27, 2010 at 8:21 AM | Permalink | Reply

kristinn

The timestamp is stored in the Webkit format, as the number of microseconds since midnight UTC of 1 January 1601. So to transform the date value from Webkit to Epoch you need to do the following conversion:

round( ( $last_visit_time - 11644473600000000 ) / 1000000 ) )

Posted July 08, 2011 at 6:07 PM | Permalink | Reply

Shawn Hughes

Hey, I stumbled across this in a googling effort to answer a little question I had. I was viewing the Top Sites DB file with SQLite Database Browser. I noticed that there was a BLOB entry for thumbnail. I viewed the contents of this and it came up as ???? in the DB viewer. I figured the program didn't know how to handle the special characters so it spit that text.

Interestingly enough it lists MANY different entries for the top sites, just not the 8 you see in the browser, and through some automated/manual carving I was able to pull out some thumbnail images.

So two questions for ya,

Do you know where the thumbnails are all stored, and in what format? The ones I carved were .jpg's and the DB file is littered with yoya/yu's but I was not able to find the thumbnails that were my main 8 for my browser.

Do you know of a browser that will better view that thumbnail BLOB record? I have a feeling if I can get that to properly display I will be able to see exactly how chrome is storing these thumbnails!

Thank you for your input :)

Posted July 11, 2011 at 10:27 AM | Permalink | Reply

kristinn

Hi
So the thumbnails are stored inside the database format.... try using something like sqlite3, command line sqlite browser... and use the dump command... something like this:
<strong>echo '.dump' | sqlite3 Top\ Sites &gt; top.txt</strong>

If you look inside the text file that gets dumped you will see familiar hex representation of a JPG for the thumbnail itself:
<em>INSERT INTO "thumbnails" VALUES('http://log2timeline.net/',7,'log2timeline',X'FFD8FFE000104A46494......</em>

So the blob value is just the JPG in a binary format.... you can also look at the schema of the table:
<strong>sqlite&gt; .schema thumbnails
</strong>
<em>CREATE TABLE thumbnails (url LONGVARCHAR PRIMARY KEY,url_rank INTEGER ,title LONGVARCHAR,thumbnail BLOB,redirects LONGVARCHAR,boring_score DOUBLE DEFAULT 1.0, good_clipping INTEGER DEFAULT 0, at_top INTEGER DEFAULT 0, last_updated INTEGER DEFAULT 0);
</em>

And there you can see that inside the thumbnails table is a column called thumbnail which is the BLOB that stores the JPG picture or the thumbnail itself.....

Posted September 23, 2011 at 7:54 AM | Permalink

Sameera

Hi There, I am a novice with Google chrome browser and stumbled upon your site while trying to access some of chrome's old files via SQL Lite. I have a snapshot of old Google chrome files from last year. I would like to access the thumbnails stored as part of this file set. The Thumbnail db exists and in sqllite I can see: database structure, the url data along with the image type (PNG). But how do I extract these thumbnail images and view them? Appreciate any help you can offer Regards, Sameera

Posted July 18, 2011 at 6:53 PM | Permalink | Reply

liran

Hi there,
I tried connecting to Chrome Db using some Java modules,
I got an error message that actually means that the DB is locked with an exclusive lock while the chrome app is running.
Is there any way connecting to the history DB while Chrome is running (without copying the History file )?
Thanks alot!

Posted November 04, 2011 at 2:34 PM | Permalink | Reply

SD

This is a great post on Chrome, one of the few places I have found information on parsing it out! Just a slight note, but if you are trying to use sql statements in sqlite3, they won't work unless terminated with a semicolon. I found this out the hard way :)

Posted December 27, 2011 at 6:48 PM | Permalink | Reply

Brad

[Question???]

Does anyone know why Google Chrome history data in the user data/default/history directory keyword_search_terms table isn't showing all the queries that are listed in the urls table?

Is this a result of the keyword_search_terms table only caching data searched from the address bar and not from the google web page itself?

Hmmm... I guess I can create a VM from the image and test this these actions but am looking for a shorter easier route from someone who has tested this scenerio.

Thanks

Posted May 06, 2012 at 9:37 PM | Permalink | Reply

Max

Kristin,
Not sure if this is still an active thread, but wanted to compliment you on the work, and the additional reference materials. I'm in the process of updating our Internet Forensics course, and including a Google Chrome, Google Plus, and Google Talk module. Your "Google Chrome Forensics" work was a great jumping off point. I concur that the Chromium code is a real challenge. And, after 3 days of reading many of the chromium.org developer posts, my head hurts. I'm glad I didn't grow up to be a programmer! Thanks again for all the great work! ~ Max

Posted December 05, 2012 at 9:36 AM | Permalink | Reply

Shane

Not sure if this will even read after 3 years of life..but here goes.

Ref the history SQL statement in the OP, it can be modded a bit to get some human readable output from the timestamps. As already noted, the timestamps are in Webkit format, so needs some tweaking to get something intelligible.

SELECT urls.url, urls.title, urls.visit_count, urls.typed_count, datetime(((urls.last_visit_time-11644473600000000)/1000000),'unixepoch','localtime') as last_visit_time, urls.hidden, datetime(((visits.visit_time-11644473600000000)/1000000),'unixepoch','localtime') as visit_time, visits.from_visit, visits.transition FROM urls, visits WHERE urls.id = visits.url

Looks scary...but all we are doing is subtracting a bunch of seconds, which have been further divided by a million from the timestamp. We then use 'unixepoch' to convert the number into human readable format. If you want the timestamp in UTC, then omit the 'localtime' modifier. 'localtime' will apply the offset to the timezone which is set on the computer you run the query on..so any inaccuracies in the local system settings will be reflected in the results..be warned.

Yes, it's missing the rounding, and I'm not sure what difference it makes, because the dates come out the same with or without it. (Kristinn?)

Ref the September 23, 2011 post (the image blob data), I was experimenting with automatic database discovery and parsing using apache, php, perl and a bit of javascript (sounds awful, but worked really well), and a pleasant side effect of displaying the image blob data was, by simply detecting the JFIF or PNG string in the first 8 chars of the blob, I send the data to an img tag... (replace jpg with png to suit). Instead of a blob of characters, you get the actual image rendered. This is real handy when you can render the large thumbnails you see when you start Chrome and view the "Most Visited" panel, or see the bookmark icons used in the bookmarks bar.

Hope someone finds this useful..and thanks Kristinn for the great blogs. A mine of information.
Cheers.

Posted February 12, 2013 at 4:45 PM | Permalink | Reply

melipone

The link for the transition type codes does not work. Can you fix it, please?

Posted May 17, 2013 at 12:52 PM | Permalink | Reply

Sara

I get the format of the time stamps, but my question relates to where Chrome is getting the time. Is it setting the time from the remote server, or the local machine.

Looking at data I'm examining, my gut tells me it's getting the time remotely - not from the local system, but I wanted to know if any one has confirmed this.

Thank you.

Posted July 30, 2013 at 1:39 AM | Permalink | Reply

Bill Pytovany

This is great information but I'm still having a brain fart of something has changed with sqlite3.

I'm trying to get the default start page and/or search page
Basically, I want to help protect people from the same companies who install unwanted toolbar and change peoples start page.
IE keeps the information in the registry but obviously Chrome would keep it in one of the databases. I just haven't found it yet and I want to be lazy tonight. :).

So I'm willing to offer a trade and share my code for getting
Chrome and Firefox cookies. If I'm not allowed to include links in a comment I understand and you can just delete this post.
http://billpstudios.blogspot.com/2009/09/sqlite-c-code-to-read-cookies.html

Thanks,
Bill

Post a Comment






Captcha

* Indicates a required field.