SANS Digital Forensics and Incident Response Blog

When Cases Involve SSNs and Credit Card Data: "Sensitive Data Search and Baseline" Python Script

A key component of any investigation is the type of data exfiltrated. If sensitive data is on a compromised machine, the risk increases significantly. There is also a patchwork of legislation covering the various types of data considered sensitive (http://www.reyrey.com/regulations/). In general, Social Security numbers and credit card numbers are at the top of the list of concerns. Since many states have encryption exemptions, a forensicator needs to know: does any storage media in the case contain sensitive data in the clear?

Data can be encrypted by system administrators/DBAs or by attackers. Attackers usually encrypt data as part of the staging process prior to data exfiltration, commonly by compressing it into a password-protected .rar file. With strong passwords (32+ character passphrases), .rar files can be difficult or nearly impossible to open with normal computing power.

Using Python, a cross-platform scripting language, a colleague and I created a script that searches a mounted drive for clear-text Social Security numbers and credit card data, then outputs the results to a file. Future versions will search hex data as well as ASCII, search for dates of birth, improve code reusability (object-oriented design), and add e-mail notification options. Files can be downloaded from here.
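To give a feel for the approach, here is a minimal sketch of this kind of search. It is not the script described above: the regular expressions, the function names (luhn_ok, find_ssns, find_cards, scan_tree) and the tab-separated report format are all assumptions made for illustration. It walks a directory tree, reads each file as ASCII text, looks for dash-separated SSNs and for 13 to 19 digit card-like numbers, and uses the Luhn checksum to cut down on false positives among card candidates.

```python
import os
import re

# Hypothetical patterns for illustration; the original script's
# expressions are not shown in this post.
SSN_RE = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")  # 13 to 19 digits


def luhn_ok(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:       # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_ssns(text):
    """Return dash-separated SSN candidates, skipping never-issued ranges."""
    hits = []
    for area, group, serial in SSN_RE.findall(text):
        if area in ("000", "666") or area.startswith("9"):
            continue
        if group == "00" or serial == "0000":
            continue
        hits.append(f"{area}-{group}-{serial}")
    return hits


def find_cards(text):
    """Return card-like digit runs that pass the Luhn check."""
    hits = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_ok(digits):
            hits.append(digits)
    return hits


def scan_tree(root, report_path):
    """Walk root and write one tab-separated line per hit to report_path."""
    with open(report_path, "w") as report:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    # Non-ASCII bytes become U+FFFD, so they cannot
                    # glue unrelated digits together.
                    with open(path, "r", encoding="ascii",
                              errors="replace") as handle:
                        text = handle.read()
                except OSError:
                    continue  # unreadable file: skip it and keep going
                for kind, finder in (("SSN", find_ssns),
                                     ("CARD", find_cards)):
                    for hit in finder(text):
                        report.write(f"{kind}\t{path}\t{hit}\n")
```

A call such as scan_tree("/mnt/evidence", "report.tsv") would scan a (hypothetical) read-only mount point. The report should be written outside the tree being scanned so it does not pick up its own output. This sketch reads each file fully into memory, so very large files would need chunked reading, and it only catches SSNs written with dashes.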

The tool can also easily be used as a quick "sensitive data baseline" check before a machine goes into production.

If you are interested in learning Python with a security nexus, SANS has an excellent course called Python for Penetration Testers: https://www.sans.org/course/python-for-pen-testers.