SANS Digital Forensics and Incident Response Blog

Strings, Strings, Are Wonderful Things

One of the basics of forensics is gathering the ASCII and Unicode strings from a drive or file system and searching them for keywords. On Linux, the strings command can extract both.

To Gather the ASCII Strings

# strings -td /dev/sdb > sdb.ascii

Note: The -td option (short for -t d) tells strings to print the decimal byte offset at which each string starts.

To Gather the Unicode Strings

# strings -td -el /dev/sdb > sdb.unicode

Note: The -el option tells strings to look for 16-bit little-endian characters. strings can handle other encodings as well, such as 16-bit big-endian (-eb) and 32-bit little/big-endian (-eL/-eB). See the man page on strings and the -e option.
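As a rough sketch, the two commands above can be wrapped in one helper so both passes always use the same source and naming. The function name gather_strings and the scratch-file demo are made up for illustration; only the strings options come from the commands above.

```shell
# Hypothetical helper: run both strings passes against one source.
gather_strings() {
    src=$1      # device or image file, e.g. /dev/sdb
    prefix=$2   # output prefix, e.g. sdb
    # -td: print the decimal byte offset of each string
    strings -td "$src" > "$prefix.ascii"
    # -el: look for 16-bit little-endian characters
    strings -td -el "$src" > "$prefix.unicode"
}

# Demo on a small scratch file instead of a real drive:
# an ASCII string, two NUL bytes, then "unicode" encoded as UTF-16LE.
demo=$(mktemp)
printf 'forensic-ascii\000\000' > "$demo"
printf 'u\000n\000i\000c\000o\000d\000e\000\000\000' >> "$demo"

gather_strings "$demo" "$demo"
cat "$demo.ascii" "$demo.unicode"
```

Against a real drive the call would look like gather_strings /dev/sdb sdb, run as root.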

Below is sample output from the command:

192301896     <member name="F:Microsoft.DirectX.DirectPlay.Address.FlowControlNone">
192301972 <summary>This field is deprecated. Deprecated components of Microsoft DirectX 9.0 for Managed Code are considered obsolete. While these components are still supported in this release of DirectX 9.0 for Managed Code, they may be removed in the future. When writing new applications, you should avoid using these deprecated components. When modifying existing applications, you are strongly encouraged to remove any dependency on these components.Deprecated.</summary>
192302446 </member>
192302461 <member name="F:Microsoft.DirectX.DirectPlay.Address.FlowControlRtsDtr">
192302539 <summary>This field is deprecated. Deprecated components of Microsoft DirectX 9.0 for Managed Code are considered obsolete. While these components are still supported in this release of DirectX 9.0 for Managed Code, they may be removed in the future. When writing new applications, you should avoid using these deprecated components. When modifying existing applications, you are strongly encouraged to remove any dependency on these components.Deprecated.</summary>
192303013 </member>
192303028 <member name="F:Microsoft.DirectX.DirectPlay.Address.FlowControlXonXoff"SZDD

Now that we have the output we can use a variety of tools to search for keywords in the output files. Some examples are:

  • grep -i keyword sdb.ascii > sdb.ascii.keyword

-i tells grep to ignore case. This is a pretty useful option, as we do not always know how the keyword will be capitalized.

  • grep -i -f keywords.txt sdb.ascii > sdb.ascii.keywords

The -f option in the above command lets you create a keyword file listing all of the keywords you are looking for, one per line.

  • egrep --color -i -f keywords.txt sdb.ascii

Egrep is equivalent to doing a "grep -E". It allows for extended regular expressions, which is another topic in itself. The key thing to pick up on in the above command is the --color option, which prints any matching keyword in a different color. On my Fedora systems, the keyword is in red. One thing to note: if you pipe egrep output to another command or redirect it to a file, you will lose the color on matching text. It is a nice way to make a keyword pop out during a quick search.
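The keyword-file search can be tried on a few lines of the sample output before pointing it at a full strings dump. This is only a sketch: the scratch directory and the two keywords are assumptions for illustration, and the data is copied from the sample output above.

```shell
# Scratch area with a tiny stand-in for sdb.ascii and a keyword list.
work=$(mktemp -d)

# One keyword per line; case does not matter because of -i below.
printf '%s\n' 'directx' 'flowcontrol' > "$work/keywords.txt"

cat > "$work/sdb.ascii" <<'EOF'
192301896 <member name="F:Microsoft.DirectX.DirectPlay.Address.FlowControlNone">
192302446 </member>
192303028 <member name="F:Microsoft.DirectX.DirectPlay.Address.FlowControlXonXoff"SZDD
EOF

# -i: ignore case, -f: read the patterns from keywords.txt
grep -i -f "$work/keywords.txt" "$work/sdb.ascii" > "$work/sdb.ascii.keywords"

# Print just the offsets of the hits for later offset math.
awk '{print $1}' "$work/sdb.ascii.keywords"
```

Watch out for blank lines in the keyword file: an empty pattern matches every line, so the output would be the entire strings file.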

Offset Math

Sometimes you want to take a closer look at the clusters/blocks where your keyword was found. Using the offsets listed in the strings output, you can quickly figure out where the keyword is in the drive or file. For example:

192303028     <member name="F:Microsoft.DirectX.DirectPlay.Address.FlowControlXonXoff"SZDD

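The offset here is 192303028 for our DirectX keyword. For this NTFS file system, the cluster size is 4096 bytes. Note that the offset is relative to the start of whatever strings was run against. If that was a whole disk such as /dev/sdb rather than a single partition, subtract the partition's starting byte offset first so the result is relative to the file system. To figure out which cluster DirectX is in, do: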

Offset / cluster size or

192303028 / 4096 = 46948.981445312 or cluster 46948

If you wanted the sector where the keyword is located:

192303028 / 512 = 375591.8515625 or sector 375591
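The same arithmetic can be done in the shell, which drops the fractional part automatically. The sketch below reuses the numbers from this example. The read_cluster helper is a hypothetical name, and it assumes the cluster number is relative to the start of the image or device passed to it.

```shell
offset=192303028     # decimal offset from the strings output
cluster_size=4096    # bytes per cluster
sector_size=512      # bytes per sector

cluster=$(( offset / cluster_size ))   # integer division
within=$(( offset % cluster_size ))    # bytes into that cluster
sector=$(( offset / sector_size ))

echo "cluster $cluster (byte $within into the cluster), sector $sector"
# prints: cluster 46948 (byte 4020 into the cluster), sector 375591

# Hypothetical helper: write one cluster's raw bytes to stdout.
# $1 = image or device, $2 = cluster number, $3 = cluster size in bytes
read_cluster() {
    dd if="$1" bs="$3" skip="$2" count=1 2>/dev/null
}
```

To look at the cluster itself, pipe the helper into a hex viewer, for example read_cluster /dev/sdb 46948 4096 | od -A d -c | less.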

Figuring Out Cluster Size

You can use the ntfsinfo command to figure out the cluster size for an NTFS file system. To do this, use:

# ntfsinfo --mft /dev/sda1
Volume Information
Name of device: /dev/sda1
Device state: 11
Volume Name:
Volume State: 1
Volume Version: 3.1
Sector Size: 512
Cluster Size: 4096

Volume Size in Clusters: 13181323

In the above output, the Sector Size and Cluster Size lines give the sector size and the cluster size in bytes.

For Linux ext2/ext3 file systems, the block size can be found with the tune2fs command. I have piped it out to grep, as the output can be lengthy.

# tune2fs -l /dev/sda2 | grep Block
Block count: 12799788
Block size: 4096
Blocks per group: 32768

Here the Block size line gives the block size in bytes.
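If you want the size as a bare number to feed into the offset math, the relevant line can be picked out with awk. This is a sketch only: the two function names are invented for illustration, and it assumes the tools label the lines "Cluster Size" and "Block size" as in the output shown here.

```shell
# Hypothetical helpers: read tool output on stdin, print the size in bytes.
ntfs_cluster_size() {
    # Split on the colon and any spaces after it; take the first match only.
    awk -F': *' '/Cluster Size/ {print $2; exit}'
}

ext_block_size() {
    awk -F': *' '/^Block size/ {print $2; exit}'
}

# Demo using lines copied from the output above.
printf 'Sector Size: 512\nCluster Size: 4096\n' | ntfs_cluster_size
printf 'Block count: 12799788\nBlock size: 4096\n' | ext_block_size
```

On a live system the calls would be ntfsinfo --mft /dev/sda1 | ntfs_cluster_size and tune2fs -l /dev/sda2 | ext_block_size.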

There you have it: the basics of using the strings command and of calculating the cluster, block, or sector where a keyword is located.

Keven Murphy, GCFA Gold #24, is the Senior Forensics Specialist for a Fortune 100 defense contractor.