SANS Digital Forensics and Incident Response Blog

How to Extract Flash Objects from Malicious PDF Files

Authors of malicious PDF documents have often relied on JavaScript embedded in the PDF file to produce more reliable Adobe Reader exploits. The attackers now also embed Flash programs, which incorporate ActionScript, in a similar manner. This note demonstrates several steps for extracting malicious Flash from PDF files, so you can analyze it for malware artifacts.

Starting to Examine the Malicious PDF File

To get a general sense for how to analyze malicious PDF files, take a look at my Analyzing Malicious Documents Cheat Sheet. From the tools perspective, Didier Stevens' pdf-parser is an all-time favorite. Another excellent tool, which sports a user-friendly GUI, is PDF Stream Dumper by "Dave".

The steps to locate and extract JavaScript from PDF files using these tools have been documented fairly well. Fortunately, the same tools can help us locate and extract embedded Flash programs.

Malicious PDF Sample

For our sample, we'll use the malicious PDF file "The Obama Administration and the Middle East.pdf" that was documented on Contagio Malware Dump. The file was sent to its targets as an attachment to an email message that looked like this:

The file name of the attachment varied. The screenshot is, again, courtesy of Contagio Malware Dump.

PDF Stream Dumper to Locate and Extract Flash Programs

We can use PDF Stream Dumper to examine the structure and contents of the malicious PDF file. Its Search_For menu allows us to quickly locate risky PDF objects, including Flash:

The tool shows that object 2 contains an embedded Flash program:

To extract the Flash program, right-click the object that contains it (item #2 in the left column) and select Save Decompressed Stream. You should now be able to examine the Flash program the way you would treat any other malicious Flash file (more on that below).
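Before moving on, it can help to confirm that the saved stream really is a Flash file. SWF files start with the three-byte signature "FWS" (uncompressed) or "CWS" (zlib-compressed), followed by a one-byte version and a four-byte little-endian length. The Python sketch below reads that header and inflates a CWS file into its FWS form; the function name inflate_swf is made up for this illustration and is not part of PDF Stream Dumper or any other tool mentioned here.

```python
import struct
import zlib


def inflate_swf(data):
    """Return (version, declared_length, uncompressed_swf_bytes).

    Hypothetical helper for illustration. Handles only the FWS and CWS
    signatures; anything else is rejected.
    """
    if len(data) < 8:
        raise ValueError("too short to be a SWF file")
    signature = data[:3]
    version = data[3]
    # Length of the whole uncompressed file, as declared in the header.
    declared_length = struct.unpack("<I", data[4:8])[0]
    if signature == b"FWS":
        return version, declared_length, data
    if signature != b"CWS":
        raise ValueError("no FWS/CWS signature at the start of the data")
    # Everything after the 8-byte header is a zlib stream. decompressobj()
    # tolerates trailing bytes left over from a sloppy extraction.
    body = zlib.decompressobj().decompress(data[8:])
    return version, declared_length, b"FWS" + data[3:8] + body
```

If the length of the returned bytes does not match the declared length, the extraction probably picked up extra bytes or lost some along the way.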

Pdf-parser to Locate and Extract Flash Programs

Another program that can help you locate malicious Flash objects in a PDF file is pdf-parser. For instance, you may be able to locate the object that stores the Flash program by running "pdf-parser --search flash":

You can extract the object's contents by using "pdf-parser --object 2 --raw > flash.swf". Because pdf-parser inserts additional information at the beginning of its output, you'll need to use your favorite editor to remove all text lines leading up to the start of the Flash code, which in this case begins with the letters "CWS".
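If you would rather script the cleanup than open an editor, the trimming step can be sketched in a few lines of Python. This is a minimal sketch that assumes the first occurrence of "CWS" or "FWS" in the file is the real start of the Flash program; if those letters happen to appear in the text that pdf-parser prints first, the result will be wrong, so check it by eye. The function name trim_to_swf is hypothetical.

```python
# Signatures of compressed (CWS) and uncompressed (FWS) Flash files.
SWF_SIGNATURES = (b"CWS", b"FWS")


def trim_to_swf(data):
    """Drop everything before the first SWF signature found in data."""
    offsets = [data.find(sig) for sig in SWF_SIGNATURES]
    offsets = [off for off in offsets if off != -1]
    if not offsets:
        raise ValueError("no SWF signature found")
    return data[min(offsets):]
```

To use it, read flash.swf in binary mode, pass the bytes to trim_to_swf, and write the result to a new file.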

You can use pdf-parser on both Linux and Windows, as long as Python is installed. For this example, I'm using REMnux, which is my Linux distribution that includes common malware analysis tools.
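Since Python is already at hand, the idea behind the keyword search used earlier in this section can also be approximated with a short script. The sketch below is not how pdf-parser works internally; it is a simplified stand-in that scans the raw bytes for "N G obj ... endobj" blocks and reports those whose dictionary portion mentions a keyword. It ignores object streams and obfuscated names, and the function name find_objects_mentioning is made up for this example.

```python
import re

# Matches "N G obj ... endobj" indirect objects in raw PDF bytes.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)


def find_objects_mentioning(pdf_bytes, keyword=b"flash"):
    """Return numbers of objects whose dictionary part mentions keyword."""
    hits = []
    for match in OBJ_RE.finditer(pdf_bytes):
        body = match.group(3)
        # Look only at the text before any stream data, because the
        # stream itself is usually compressed or binary.
        header = body.split(b"stream", 1)[0]
        if keyword.lower() in header.lower():
            hits.append(int(match.group(1)))
    return hits
```

Called on the bytes of a PDF file read in binary mode, it returns a list of object numbers worth a closer look with pdf-parser or PDF Stream Dumper.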

Analyzing the Malicious Flash Program

A number of tools can examine contents of a Flash program and extract embedded ActionScript. For instance, SWFDump, part of the free SWFTools distribution, can do the trick if you call it using "swfdump -Ddu":

SWFDump disassembles any ActionScript it locates within the Flash program. In our example, the code implements heap-spraying, presumably to transfer control to the attacker's code once the Flash vulnerability is exploited:

A promising tool for Flash analysis that might some day offer an alternative to SWFDump is SWFREtools, released by Sebastian Porst. Unfortunately, the development of this tool seems to have stalled. Another very promising tool for examining SWF files is SWF Investigator by Adobe.

Examining this code is beyond what I'd like to cover in this posting. Fortunately, Hermes Bojaxhi documented his analysis steps of the same (or very similar) sample. Take a look there if this topic interests you.

We still have much to learn for dealing with Flash programs in PDF files. If you can recommend additional tools or techniques, please leave a comment.

Lenny Zeltser focuses on safeguarding customers' IT operations at NCR Corporation. He also teaches how to analyze malware at SANS Institute. Lenny is active on Twitter and writes a security blog.


Posted May 5, 2011 at 7:25 AM


Nice tutorial. By the way, how do you extract an object, such as Flash or raw data, from an MS Office document?

Posted November 29, 2013 at 10:20 AM


I tried to download PDF Stream Dumper, but Symantec antivirus deleted it. A VirusTotal check reports several issues with that file:

Posted December 2, 2013 at 2:36 PM

Lenny Zeltser

Indeed, antivirus tools sometimes trigger on PDF Stream Dumper because some of the components it uses can be considered undesirable on most systems. As with any tool of this nature, I suggest you run it only in an isolated lab.