SANS Digital Forensics and Incident Response Blog: Tag - pdf

How to Extract Flash Objects from Malicious PDF Files

Authors of malicious PDF documents have often relied on JavaScript embedded in the PDF file to produce more reliable Adobe Reader exploits. The attackers now also embed Flash programs, which incorporate ActionScript, in a similar manner. This note demonstrates several steps for extracting malicious Flash from PDF files, so you can analyze it for malware artifacts. We will take a brief look at using pdf-parser, PDF Stream Dumper and SWFDump for this purpose.
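To make the idea of pulling Flash out of a PDF concrete, here is a minimal Python sketch. It is not how pdf-parser, PDF Stream Dumper or SWFDump work internally; it only illustrates the general approach of walking PDF stream objects and checking each payload for an SWF signature. It assumes streams are either stored raw or compressed with /FlateDecode, and the function name find_swf_streams is hypothetical. Other filters, object streams and encrypted documents would need a real parser.

```python
import re
import zlib

# SWF files begin with "FWS" (uncompressed) or "CWS" (zlib-compressed body).
SWF_MAGICS = (b"FWS", b"CWS")

# Captures the raw bytes between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)


def find_swf_streams(pdf_bytes):
    """Return decoded stream payloads that start with an SWF signature.

    Hypothetical helper: assumes each stream is raw or /FlateDecode only.
    """
    hits = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # Try zlib inflation first (the /FlateDecode assumption).
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not zlib data; inspect the bytes as stored.
            data = raw
        if data[:3] in SWF_MAGICS:
            hits.append(data)
    return hits
```

Each payload returned this way could be written to disk and handed to a Flash analysis tool. Treat a hit as a lead to verify by hand, since malicious PDFs frequently use filter chains or obfuscation that this sketch does not handle.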

PDF malware analysis

I decided to do some malware analysis as part of a presentation I had to give. Since I went through the whole process, I am posting it here in case anyone is interested.

To begin with, I needed to find some malware to analyze. A great place to find live links to active malware is the Malware Domain List site.

What I wanted to show was that a fully patched machine with fully updated AV is not always enough to protect you. One way to demonstrate that is to find a PDF or Flash exploit. The one that I chose for this experiment was this one:

PDF exploit to be used

Application Metadata of Nested Documents

by John McCash

I was drawn to consider something by a question on a certification practical exam I recently took. The problem had been presented as "find the specified text in the supplied disk image". However, the text actually turned out to be viewable in a JPEG file which was nested inside a Word document. Once I'd found the text, the question was essentially answered, but then I started thinking about extraction options and the origins of that JPEG file.

I recalled a tool I'd recently discovered thanks to traffic on the GCFA mailing list, hachoir-subfile. The original email context was about using this tool to extract executable objects from PPS files, but it turns out that it works equally well to extract .jpg files. I had always assumed that when image files were incorporated into MS Office documents, they were somehow re-encoded,