SANS Digital Forensics and Incident Response Blog

Bring Me My Pipe


Often used and underappreciated, the pipe feature in Unix/Linux/DOS has to be my favorite tool in incident response and forensics.

Need the device at /dev/sda imaged with progress indicators and an md5sum?

dd if=/dev/sda | pipebench | tee sda.dd | md5sum > sda.md5.txt
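A quick sanity check on that pipeline, as a sketch: because tee writes the image while md5sum hashes the same stream, re-hashing the finished image should give the same digest. The example below swaps /dev/sda for a small scratch file and leaves out pipebench so it runs anywhere; the variable names and temp files are made up for illustration.

```shell
# Stand-in for a real device: 64 KiB of random data in a scratch file.
src=$(mktemp); img=$(mktemp); sum=$(mktemp)
dd if=/dev/urandom of="$src" bs=1024 count=64 2>/dev/null

# Same shape as the imaging pipeline: read once, write the image, hash in flight.
dd if="$src" 2>/dev/null | tee "$img" | md5sum > "$sum"

# Re-hash the image on disk and compare it with the digest recorded in flight.
recorded=$(cut -d ' ' -f 1 "$sum")
ondisk=$(md5sum < "$img" | cut -d ' ' -f 1)
if [ "$recorded" = "$ondisk" ]; then
  echo "image verified"
else
  echo "MISMATCH" >&2
fi
```

On a real case you would run the second md5sum against sda.dd and compare it with the contents of sda.md5.txt.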

Need a summary of the unique hosts from Internet Explorer's index.dat history file?

pasco index.dat | grep -v 'javascript\:' | egrep -i 'ftp|http' | sort -k 4 | awk '{print $3}' | awk 'BEGIN {FS="/"}{print $2$3}'| sort | uniq | less
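To see what the last awk stage is doing, here is a sketch that feeds a few made-up URLs (not real pasco output) through the host-extraction step. Splitting on "/" leaves the scheme in field 1, an empty field 2, and the host in field 3, which is why printing $2$3 yields just the hostname. The function name extract_hosts is only for illustration.

```shell
# extract_hosts: read URLs on stdin, print the unique hostnames.
extract_hosts() {
  awk 'BEGIN {FS="/"} {print $2$3}' | sort | uniq
}

# Hypothetical sample URLs standing in for the URL column of pasco output.
printf '%s\n' \
  'http://example.com/index.html' \
  'http://example.com/images/logo.png' \
  'ftp://files.example.org/pub/tool.tgz' | extract_hosts
```

Each host is printed once, however many URLs pointed at it.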

No other feature I can think of makes it easier to quickly analyze data. There's even a wiki entry dedicated to pipe 101 exploration.

Here are the top 3 bash pipelines I use every day on my Gentoo Linux workstation:

1) Tailing log files in real-time:
tail -f tabDelimitedNastyBusyLogfile.txt | remark filterfilehere.txt | awk 'BEGIN {FS="\t"} { print $fieldnumberhere, $fieldnumberhere; system("") }'
tail -f tabDelimitedNastyBusyLogfile.txt | egrep -vif whitelistfilehere.txt | awk 'BEGIN {FS="\t"} { print $fieldnumberhere, $fieldnumberhere; system("") }'

Have a busy syslog file (ASA firewall?) to monitor that you couldn't possibly read fast enough? Piping it through remark or a grep whitelist limits what you see, while awk picks out the fields you want to highlight. Remark can also color-code output, which is handy for assigning colors to things like IP netblocks and keywords.

Quick frustration-avoidance tip: the system("") at the end of the awk script forces awk to flush its output buffer after every line, so you see output right away instead of waiting for the buffer to fill.
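awk is not the only stage that buffers. grep also switches to block buffering when its output goes to a pipe, so on a quieter log the whitelist stage can hold lines back too. Here is a minimal sketch, assuming GNU grep (for --line-buffered) and using a made-up log, whitelist, and field numbers; filter_fields is a hypothetical helper name.

```shell
# Hypothetical sample data standing in for a busy tab-delimited log.
log=$(mktemp); whitelist=$(mktemp)
printf '10.0.0.5\tGET\t/index.html\n10.0.0.9\tGET\t/healthcheck\n' > "$log"
printf 'healthcheck\n' > "$whitelist"

# filter_fields: drop lines matching any pattern in the whitelist file ($1),
# then print fields 1 and 3 of what is left, flushing after every line.
filter_fields() {
  grep --line-buffered -vif "$1" |
    awk 'BEGIN {FS="\t"} { print $1, $3; system("") }'
}

filter_fields "$whitelist" < "$log"
```

For a live file, feed it from tail -f instead of the redirect, for example: tail -f somelog.txt | filter_fields whitelist.txt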

Do you use the private IP space? You can quickly get a visual of good IP ranges with a remark entry like:

/172\.21\.([0-9]{1,3}\.)[0-9]{1,3}/g {green }

which will color every 172.21.x.x address green (adjust the prefix to match your own private ranges), making it easy to tell at a glance whether you are the source or the destination.

Alternatively you can filter out/white list entries with a remark entry like

/Accessed URL/ skip

if you're not interested in those entries, allowing you to home in on what you're after.

2) while read i
This shell construct:
somecommand | while read i;do somecommandto $i;done

is one of the most useful I've ever found for quickly applying a command to a lot of data. For example, recovering deleted files from an NTFS dd image:

ils -rf ntfs imagefile.dd | awk -F '|' '($2=="f") {print $1}' | while read i; do icat -rsf ntfs imagefile.dd $i > ./deleted/$i; done

A quick way to sort unknown files (maybe those recovered using the command above) by type:

file * | grep -i jpeg | cut -f 1 -d ':' | while read i; do mv "$i" jpegs; done
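One caveat with these loops: a plain read trims leading and trailing whitespace and treats backslashes as escapes, which can mangle odd filenames. A more defensive sketch uses IFS= read -r and quotes the variable. The scratch directory and filenames below are made up, and the selection is by extension only to keep the example self-contained; swap in the file/grep/cut stages from above for real content-based sorting.

```shell
# Scratch directory with one "JPEG" (by name only) and one other file.
workdir=$(mktemp -d)
mkdir "$workdir/jpegs"
touch "$workdir/holiday photo.jpg" "$workdir/notes.txt"

# IFS= keeps leading/trailing whitespace, -r keeps backslashes literal,
# and quoting "$f" keeps names with spaces in one piece.
find "$workdir" -maxdepth 1 -type f -name '*.jpg' |
  while IFS= read -r f; do
    mv -- "$f" "$workdir/jpegs/"
  done

ls "$workdir/jpegs"
```

Filenames containing newlines would still break any line-oriented loop like this one.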

Not strictly using a 'while', but still useful if you need to quickly resolve hostnames in an IP range:

for (( i=1; i<=254 ; i++ )) ; do resolveip 10.0.0.$i 2>/dev/null ; done | grep -vi 'Unable'

3) Quick totals
Sort, uniq and head piped together can get you a top 10 quicker than Dave Letterman:
cat filewithlotsofIPAddresses.txt | egrep -oE "([0-9]{1,3}\.){3}[0-9]{1,3}" | sort | uniq -c | sort -rn | head -n10

will get you the top 10 IP addresses in a file sorted by appearances, highest to lowest. Not as funny as Letterman, but hey it's Linux!
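If you reach for this often, it wraps up nicely as a shell function. This is just a sketch: top_ips is a made-up name, the sample log lines are invented, and the second argument (how many rows to show) defaults to 10.

```shell
# top_ips FILE [N]: count IPv4-looking strings in FILE, most frequent first.
top_ips() {
  grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' "$1" |
    sort | uniq -c | sort -rn | head -n "${2:-10}"
}

# Hypothetical firewall-style lines; 10.0.0.1 appears three times,
# 10.0.0.2 twice, and 10.0.0.3 once.
sample=$(mktemp)
printf '%s\n' \
  'deny 10.0.0.1 -> 10.0.0.2' \
  'deny 10.0.0.1 -> 10.0.0.3' \
  'permit 10.0.0.1 -> 10.0.0.2' > "$sample"

top_ips "$sample" 2
```

Note that the regex matches anything shaped like a dotted quad, so strings such as 999.999.999.999 are counted too.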

No doubt if you've been around the block more than a few times you've got your own pipelines you can't live without. If you're open to it, share them so folks can pick up something new or add to their favorites.


Jeff Bryner, GCFA Gold #137, also holds the CISSP and GCIH certifications, occasionally teaches for SANS, and performs forensics, intrusion analysis, and security architecture work on a daily basis.


Posted September 17, 2008 at 6:09 PM


I am the author of the tool you referenced, Pasco. I wanted to let you know that Pasco is no longer being updated at the site in your posting. Instead, you can get the latest info and updates on any of the open source tools I have written at