It is very interesting to study the obfuscation techniques that attackers use in malicious PDF documents. As mentioned in my previous blog entry, one of the simplest yet most interesting obfuscation techniques is cascading filters. This basically means that the malicious JavaScript code is buried beneath multiple layers of encoded streams.
In the particular sample I was analyzing, the malicious JS was obfuscated with 5 stream filters (ASCIIHexDecode, LZWDecode, ASCII85Decode, RunLengthDecode, FlateDecode).
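To make the idea of cascading filters a bit more concrete, here is a minimal sketch of how such a chain could be undone by hand in Python. It is only an illustration under a few assumptions: the function names (`decode_cascade` and friends) are my own, the filters are applied in the order they are listed, and LZWDecode is deliberately left out because the standard library has no decoder for it, so a stream using it would raise an error here.

```python
import base64
import binascii
import zlib


def ascii_hex_decode(data: bytes) -> bytes:
    # Drop whitespace, stop at the '>' end-of-data marker,
    # and pad an odd number of hex digits with a trailing '0'.
    hex_digits = b"".join(data.split()).split(b">", 1)[0]
    if len(hex_digits) % 2:
        hex_digits += b"0"
    return binascii.unhexlify(hex_digits)


def ascii85_decode(data: bytes) -> bytes:
    # PDF streams normally end with '~>' and have no leading '<~',
    # so strip whichever markers are present before decoding.
    body = data.strip()
    if body.startswith(b"<~"):
        body = body[2:]
    if body.endswith(b"~>"):
        body = body[:-2]
    return base64.a85decode(body)


def run_length_decode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        length = data[i]
        i += 1
        if length == 128:      # end-of-data marker
            break
        if length < 128:       # copy the next length+1 bytes literally
            out += data[i:i + length + 1]
            i += length + 1
        else:                  # repeat the next byte 257-length times
            out += data[i:i + 1] * (257 - length)
            i += 1
    return bytes(out)


# Hypothetical lookup table; LZWDecode is intentionally missing.
DECODERS = {
    "ASCIIHexDecode": ascii_hex_decode,
    "ASCII85Decode": ascii85_decode,
    "RunLengthDecode": run_length_decode,
    "FlateDecode": zlib.decompress,
}


def decode_cascade(data: bytes, filters: list) -> bytes:
    # Peel off one layer per filter, in the order the filters are listed.
    for name in filters:
        if name not in DECODERS:
            raise NotImplementedError("no decoder in this sketch for " + name)
        data = DECODERS[name](data)
    return data
```

Calling `decode_cascade(raw_stream, ["ASCIIHexDecode", "ASCII85Decode", "RunLengthDecode", "FlateDecode"])` would peel off four layers one after another. It works on toy input, but real samples tend to be messier, which is exactly why doing this by hand gets old quickly.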
Personally, I find that having to do stream extraction and decoding manually can be a very frustrating experience. Luckily, though, I stumbled upon pyew, a Python-based malware analysis tool that can be used to deobfuscate heavily obfuscated code (pun not intended!).
By identifying the offset where the content is located, we can seek to it in the file with pyew, and it will automatically decode the encoded content.
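If you want a rough idea of where those offsets are before opening the file in pyew, something like the following sketch can list them. This is not how pyew does it; it is just a naive regex scan I am assuming is good enough for a first look. The name `find_streams` is made up for illustration, and the scan can be fooled by malformed files or by stream data that happens to contain the word `endstream`.

```python
import re

# Match the body between the 'stream' and 'endstream' keywords.
# The end-of-line before 'endstream' is treated as optional.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)


def find_streams(pdf_bytes: bytes):
    """Yield (offset, raw_body) for each stream body found in the file."""
    for match in STREAM_RE.finditer(pdf_bytes):
        yield match.start(1), match.group(1)
```

Reading the file with `open(path, "rb").read()` and looping over `find_streams(...)` gives the byte offset of each stream body, which is the kind of offset you would then seek to.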