SANS Digital Forensics and Incident Response Blog

Timeline analysis with Apache Spark and Python

This blog post introduces a technique for timeline analysis that mixes a bit of data science with domain-specific knowledge (file systems, DFIR).

Analyzing CSV-formatted timelines by loading them into Excel or another spreadsheet application can be inefficient, and at times impossible. It all depends on the size of the timelines and on how many different timelines or systems we are analyzing.

Looking at timelines that are gigabytes in size, or trying to correlate data between the timelines of 10 different systems, does not scale well with traditional tools.

One way to approach this problem is to leverage some of the open source data analysis tools that are available today. Apache Spark is a fast and general engine for big data processing. PySpark is its Python API, which, in combination with Matplotlib, Pandas and NumPy, will allow you to drill down and analyze large amounts of data using SQL-syntax statements. This can come in handy for things like filtering, combining


Cloak Your Incident Investigation with Confidentiality

Summary: When an enterprise investigates a data security incident, it is often wise to involve legal counsel early. Counsel may be able to ensure the details of the investigation are kept confidential by law.

Infosec Law and Politics Are Dangerous.

The law and politics surrounding data security are highly adversarial. Legal and political adversaries have incentive to prove that an enterprise like a corporation or a government agency made a mistake (e.g., suffered a breach).

Plaintiff lawyers these days make a lot of money suing enterprises for breaches of patient or customer data.

And, politicians like state attorneys general attract a lot of media attention by hollering at local companies or healthcare entities that have lost personal data.

There is nothing inherently wrong with lawyers bringing lawsuits or politicians complaining in the media.

But an enterprise does not want


Hindering Exploitation by Analysing Process Launches

Malware can do some nasty things to your system, but it needs to get on there first. Thankfully, users have become more suspicious of files named FunnyJokes.doc.exe and so malware authors have had to become more innovative, using a mix of social engineering and the constant stream of 0-day browser exploits to land evil code on your box.

Popular infection methods include leveraging exploit kits to run arbitrary code in the context of your browser and 'infecting' document files, such as Microsoft Word documents, which still, 20 years after the first macro virus, allow you to automate the downloading and execution of files. Recently I have been pondering the similarities between various attack types and how they present themselves on the end user's machine. It strikes me that, more often than not, the endgame is the targeted program launching a process.
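As a rough illustration of that observation, the sketch below flags process launches whose parent is a commonly targeted program. Everything in it is an assumption made for illustration: the watched program names, the ProcessLaunch record, the helper names, and the choice to ignore a program relaunching itself are all hypothetical, and real launch events would come from whatever process-auditing source is available.

```python
from dataclasses import dataclass

# Hypothetical watch list of commonly targeted programs (an assumption,
# not a list taken from the post).
WATCHED_PARENTS = {"iexplore.exe", "winword.exe", "acrord32.exe"}


@dataclass(frozen=True)
class ProcessLaunch:
    """One process-launch event: who launched what (hypothetical record)."""
    parent_image: str
    child_image: str


def image_name(path: str) -> str:
    """Return the lower-cased file name from a Windows or POSIX style path."""
    return path.replace("\\", "/").rsplit("/", 1)[-1].lower()


def flag_suspicious_launches(events, watched=WATCHED_PARENTS):
    """Keep launches where a watched program started a different program.

    Assumption: a watched program starting another copy of itself
    (for example, a browser opening a new window) is not flagged.
    """
    flagged = []
    for event in events:
        parent = image_name(event.parent_image)
        child = image_name(event.child_image)
        if parent in watched and child != parent:
            flagged.append(event)
    return flagged


# Made-up sample events, for demonstration only.
events = [
    ProcessLaunch(r"C:\Program Files\Internet Explorer\iexplore.exe",
                  r"C:\Users\bob\AppData\Local\Temp\payload.exe"),
    ProcessLaunch(r"C:\Windows\explorer.exe",
                  r"C:\Program Files\Microsoft Office\WINWORD.EXE"),
    ProcessLaunch(r"C:\Program Files\Internet Explorer\iexplore.exe",
                  r"C:\Program Files\Internet Explorer\iexplore.exe"),
    ProcessLaunch(r"C:\Program Files\Microsoft Office\WINWORD.EXE",
                  r"C:\Windows\System32\cmd.exe"),
]

flagged = flag_suspicious_launches(events)
for event in flagged:
    print(image_name(event.parent_image), "->", image_name(event.child_image))
```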

This raises the question:

Is there ever a legitimate reason for Internet


Device Profiling With Windows Prefetch

It wasn't that long ago that every report I read containing Windows prefetch artifacts included only the basics: executable name, first and last time executed (now eight timestamps in Win8), and number of executions. There is much more information stored in prefetch files, but until recently there were few tools to easily parse and provide it to the examiner. Mark McKinnon wrote one of the first prefetch parsers to include full path names for additional files accessed within the first ten seconds of application launch. TZWorks' pf tool now also provides this information. Depending on case type, this information could be overkill, but imagine a prefetch file tracking execution of a malicious binary while also identifying a related malicious DLL loaded, or the location of


A Threat Intelligence Script for Qualitative Analysis of Passwords Artifacts

The Verizon Data Breach Report has consistently said over the years that passwords are a big part of breach compromises. Dr. Lori Cranor and her team at CMU have done extensive research on how to balance password policies against usability. In addition, Alison Nixon's research describes techniques for determining the valid passwords of an organization you are not a part of ("Vetting Leaks Finding the Truth when the Adversary Lies"). What about passwords leaked from the organization you are defending? This post is about such a scenario.

Ms. Carmen Medina, former Deputy Director of the Center for the Study of Intelligence, says "analysis in essence is putting things correctly into categories", "insight is when you come