Forensic Challenge 2010/6 – Analyzing Malicious Portable Destructive Files is now live

Another challenge is ready to be tackled by forensic analysts, students, hackers and alike. This time, we present you with an attack vector that has become quite successful: malicious PDF files!

For challenge 6 of our series (provided by Mahmud Ab Rahman and Ahmad Azizan Idris from the Malaysia Honeynet Project Chapter) we present you with a pcap file that contains network traffic generated by the following scenario: An unsuspecting user opens a compromised web page, which redirects the user’s web browser to a URL of a malicious PDF file. As the PDF plug-in of the browser opens the PDF, the unpatched version of Adobe Acrobat Reader is exploited and, as a result, downloads and silently installs malware on the user’s machine.

We prepared a set of questions that requires you to dive deep into the portable document format. Submit your solution by November 30th 2010. The top three submissions will receive small prizes.


Christian Seifert
Chief Communications Officer
The Honeynet Project

Source: The Honeynet Project’s Blog

Converting String, Hex and Fixnum Using Ruby

1.0 Introduction

Software development in the security domains always involve converting from and to hex and binary format. For those new to certain languages, a high learning curve is involved and this translates to increasing the development cost.

This article concentrates in using the ruby language to help new comers shorten the learning curve.

To help us understand this article, we will use the below data:

str = “ABC

Please note that there is a new line character (\n) between C and D. Table 1.0 contains the same variable presented in 4 different formats.

String(Binary) A B C \n D Hex 41 42 43 0A 44 Fixnum(Decimal) 65 66 67 10 68 Binary 01000001 01000010 01000011 00001010 01000100

Table 1.0 Data Presentation Comparison

When we store ‘A’ character into a variable, it needs to be placed in memory. Since our RAM can only store 1 and 0, the ‘A’ character needs to be converted to this binary format. Base on ASCII table ( it is agreed that the ‘A’ character should have 01000001 which is equal to 65 in decimal.

From the ASCII table, a new line character will be stored as 00001010 in RAM which is equal to 10(decimal).

Now we are ready for the next phase which is to convert the data into ruby language.

2.1 Converting Hex To BinaryString

First we will look at how to convert hex to binary.

sHex = “4142430A44”
puts [sHex].pack(‘H*’)     ==> “ABC\nD”

pack() is a method  for array object.  Originally sHex is a string, so we need to put it in the block to convert it to array.

Pack method will produce a BinaryString. The ‘H*’ directive will tell ruby that the array element is a Hex string.

There are many directives available (

2.2 Converting BinaryString To Hex

For converting BinaryString to hex, we should use unpack with H* as the format parameter.

str = “ABC\nD”
str.unpack(‘H*’)  ==> [“4142430a44”]

unpack() will return array which contains a string of the hex format in its first element. To get the  string of hex you can try

str.unpack(‘H*’)[0]    ==> “4142430a44”

2.3 Converting BinaryString To Binary

unpack() can also be used to present data in binary. Use B* as the format parameter as below

str.unpack(‘B*’)[0]  => “0100000101000010010000110000101001000100”

The result is quite long. To understand it, split the string so that each group has 8 numbers. This is because each character consumes 8 bit in memory.

01000001 01000010 01000011 00001010 01000100 A B C \n D

2.4 Converting Hex To Binary

“41”.class is a String. This means our memory will store “00110100” (decimal =52, hex = 34) and “00110001”( decimal = 49, hex = 31).

“41”.hex.class is a FixNum. “41”.hex will tell ruby to read those string as hex, as a result stores “01000001” (decimal = 65, hex = 41) in memory. The two examples will definitely be interpreted differently by a CPU.

To display the same value in binary we can use to_s(2) method from the Fixnum class.

“41”.hex.to_s(2)  ==> “01000001”

The result is a string which contains a binary representative of 0x41.

Value 2 for the parameter means to display the value in base 2. Sending 16 as base will output the same result “41”, as hex is base 16. You can try to pass any integer between 2 and 36 and study the output for further exercise.

2.5 Converting Binary To BinaryString

pack(‘B*’) method from Array class will process the first element of the array and present it in BinaryString.

[“0100000101000010010000110000101001000100”].pack(‘B*’) ==>  “ABC\nD”

3.0 Conclusion

Always remember, that machines do store information in streams of 0 and 1. Since human have limitations in memorizing  long numbers, hex representation is used which can still represent the same value.

Different from both above, string is a stream of character human use for storing information. ASCII table is used to convert information stored in a computer to a format that human can understand

4.0 Reference

3- … ml#M000689

ProFTPD 1.3.3c Compromise:Trojan Source Code

On Sunday, the 28th of November 2010 around 20:00 UTC the main distribution server of the ProFTPD project was compromised.  The attackers most likely used an unpatched security issue in the FTP daemon to gain access to the server and used their privileges to replace the source files for ProFTPD 1.3.3c with a version which contained a backdoor.

The fact that the server acted as the main FTP site for the ProFTPD project (, as well as the rsync distribution server ( for all ProFTPD mirror servers means that anyone who downloaded ProFTPD 1.3.3c from one of the official mirrors from 2010-11-28 to 2010-12-02 will most likely be affected by the problem.

The attacker did not touch the repositories, instead he managed to change the .gz and .bz2 file to include the altered source code that will enable him to:

  1. track down  the computer used to compile the source code
  2. plant a backdoor that will enable him to escalate to root privileges on the proftpd installed server.

The first attempt done was changing the configure file and adding the tests.c file in the tests directory. The attacker added 4 lines in the configure file as shown below.

The first step to be done before compiling the source code is to run ./configure. When a user runs ./configure, the tests.c file found in the tests directory which is a file added by the attacker will be compiled without the user’s consent. This will then produce an executable file name tests. The code in line 3 will then run tests to send information to a server at 212.x.y.z

Figure 2 is a snippet from tests.c showing that the program will connect to 212.x.y.z on port 9090 and send the string “GET /AB HTTP/1.0\r\n\r\n” which is a Get method for the HTTP protocol. This will tell the attacker which IP is using the compromised source code and might be a potential victim.

For the second payload, the attacker has altered the help.c file in the src directory.

The above line is added at line 129 in the pr_help_add_response function. This line of code will be executed when ftp client sends HELP ACIDBITCHEZ to the vulnerable proftpd even without user authentication. It will then run /bin/sh or /sbin/sh and give the shell to the attacker with root privileges. Figure 3 shows the vulnerable ftpd giving root privileges to an unauthenticated user.

Users are strongly advised to make sure they are not using the compromised program. Below are the md5sum of the source tarball for version 1.3.3c.

8571bd78874b557e98480ed48e2df1d2  proftpd-1.3.3c.tar.bz2
4f2c554d6273b8145095837913ba9e5d  proftpd-1.3.3c.tar.gz

IDA Pro: IDC Script for Decrypting VB Obfuscated Malware

I was playing with a piece of malware with Jun Yee and we came across an obfuscated string in the VB code. The malware itself was written in Microsoft Visual Basic 6. It has a feature that allows the malware to overwrite itself after execution just to make it a bit stealthier. Additionally, the virus itself contains an obfuscated string . Thanks to Jun Yi for helping me decrypt it faster.

Binary Hash: A2904D4E6527278C94EAC1FB2B665572

Antaramuka Pengaturcaraan Aplikasi untuk VirusTotal

Virustotal telah menjadi salah sebuah tempat rujukan yang sangat berguna dalam memastikan sesebuah fail itu berbahaya atau tidak. Jika dilihat dari sisi hadapan, virustotal telah mengumpulkan antivirus-antivirus yang terkenal sebagai enjin untuk memberitahu tentang status sesebuah fail yang ingin dikesan. Ini ketara keberkesanannya dari sudut keutuhan sesebuah keputusan, yang mana, rujukan silang (cross-reference) diantara kesemua antivirus-antivirus dapat dilihat di dalam virustotal, seterusnya dapat mengurangkan kadar kesilapan dalam proses pengesanan.

Untuk menambahbaik lagi skop kemampuan dan fungsi virustotal, virustotal telah ditambah dengan fungsi antaramuka pengaturcaraan aplikasi virustotal, atau VirusTotal API. Dengan menggunakan Virustotal API, kita boleh memuatnaik dan mengesan fail serta URL, atau mengakses laporan fail yang telah dimuatnaik sebelum ini, tanpa melalui laman web utama virustotal. Ia boleh dilakukan dengan melaksanakan permintaan HTTP Post ke URL tertentu di virustotal. Untuk maklumat lanjut mengenai gerak kerja VirusTotal API serta cara-cara perlaksanaannya, sila klik disini.

Saya juga tidak terkecuali dalam memanfaatkan kegunaan VirusTotal API dalam beberapa projek yang saya jalankan. Berikut adalah implimentasi ringkas menggunakan bahasa pengaturcaraan ruby untuk mengakses laporan yang sudah sedia ada di pengkalan data Virustotal berdasarkan MD5 hash yang disediakan oleh kita ;

Detecting Virtualized Environment in Gnu/Linux

As sysadmin, it is hard to tell if you’re  in physical or virtualized environment 😉

Below are some command line available to detect whether we’re in virtualized environment or not :

user@server1:~$ dmesg | grep -i vmware
[    0.000000] ACPI: SRAT 0000000041ef07f6 00080 (v02 VMWARE MEMPLUG  06040000 VMW  00000001)
[    1.470135] ata1.00: ATAPI: VMware Virtual IDE CDROM Drive, 00000001, max UDMA/33
[    1.510687] scsi 0:0:0:0: CD-ROM            NECVMWar VMware IDE CDR00 1.00 PQ: 0 ANSI: 5
[    3.420736] scsi 2:0:0:0: Direct-Access     VMware   Virtual disk     1.0  PQ: 0 ANSI: 2
[    3.421765] scsi 2:0:1:0: Direct-Access     VMware   Virtual disk     1.0  PQ: 0 ANSI: 2

or :

user@server1:~$ dmesg | grep -i virtual
[    0.290665] Booting paravirtualized kernel on bare hardware
[    1.268543] input: Macintosh mouse button emulation as /devices/virtual/input/input2
[    1.470135] ata1.00: ATAPI: VMware Virtual IDE CDROM Drive, 00000001, max UDMA/33
[    3.420736] scsi 2:0:0:0: Direct-Access     VMware   Virtual disk     1.0  PQ: 0 ANSI: 2
[    3.421765] scsi 2:0:1:0: Direct-Access     VMware   Virtual disk     1.0  PQ: 0 ANSI: 2

or :

user@server1:~$ dmidecode | egrep -i ‘manufacturer|product’
Manufacturer: VMware, Inc.
Product Name: VMware Virtual Platform
Manufacturer: Intel Corporation
Product Name: 440BX Desktop Reference Platform
Manufacturer: No Enclosure
Manufacturer: GenuineIntel
Manufacturer: GenuineIntel

or :

user@server1:~$ dmidecode | egrep -i ‘vmware|virtual’
Manufacturer: VMware, Inc.
Product Name: VMware Virtual Platform
Serial Number: VMware-56 4d a7 a1 10 59 2a e7-76 16 97 8a 38 5d 6e 1c
VME (Virtual mode extension)
VME (Virtual mode extension)
Description: VMware SVGA II
String 2: Welcome to the Virtual Machine


user@server1:~$ cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: NECVMWar Model: VMware IDE CDR00 Rev: 1.00
Type:   CD-ROM                           ANSI  SCSI revision: 05
Host: scsi2 Channel: 00 Id: 00 Lun: 00
Vendor: VMware   Model: Virtual disk     Rev: 1.0
Type:   Direct-Access                    ANSI  SCSI revision: 02
Host: scsi2 Channel: 00 Id: 01 Lun: 00
Vendor: VMware   Model: Virtual disk     Rev: 1.0
Type:   Direct-Access                    ANSI  SCSI revision: 02

(Yet Another) Quick Botnet Analysis

Botnets are network of malware-infected machines that are controlled by an adversary. Our approach to in studying this botnet is to perform active analysis by using an actual malware sample, infecting the machine and observe its activities.

As we probe deeper into the network traffic collected by Wireshark, we find very detailed IRC functionality, attack commands, and capabilities. We learn that the ‘scan//” refers to a specific Denial of Service (DoS) attack function. Other notable strings include the IRC channel names like  “#dpi”, “#!” and “#Ma” which may identify the IRC channel in which infected computers are summoned. Similarly the string “HTTP SET” is instructing the infected system to download the malicious executable file from the remote web server (other malware distribution site).

As displayed above, we have to make some modifications on our irc client to join the botnet with the password “laorosr”. The modification is rather simple, loaded the IRC client with OllyDbg and modified the string “NICK” to “NCIK” which allowed our IRC client joined the botnets without any error messages.

Update for Gallus Nov 3, 2010

Here are some of the major changes in the recent Gallus:

  • Improved extraction of malform PDF object structure
  • Added CAPTCHA functionality within sample submission
  • Integrate virustotal API as ‘two-factor verification’ of sample analysis
  • Added support for Adobe LibTIFF exploit analysis and detection

If you happen to come across with error/bugs while using Gallus, feel free to shoot us an e-mail at honeynet (at)

No endstream, no endobj, no worries

In analyzing malicious PDF documents, being able to understand the format of its object structure is definitely useful. In order to look for malicious content inside the file, we might need to go through some of the process that’ll include interpreting the PDF object structure. The PDF object is enclosed with “obj” and “endobj”. Between the “obj” and “endobj” there are  usually  2 components, object dictionary and stream. Object dictionary are represented by keys and values that enclosed with “<<” and “>>”, while stream is a sequence of bytes. A stream shall consist of zero or more bytes bracketed between the keywords stream (followed by newline) and endstream.

The below snippet reflects the normal PDF object structure;

obj 1 0
<< /Length 12 >>

The obj 1 0 contains the dictionary (in between << and >>) of /Length (key) with value of 12. Below the dictionary, the stream exist with string “HELLO WORLD!” just before the endstream.  Finally, thehe object structure is closed  with endobj tag which indicate the end of object 1 0’s portion.

Although the PDF object structure is rather easy to understand, these structure can also be easily manipulated in many ways for malicious intent. The main reason of manipulation purpose is to break the analysis process particularly for PDF analysis tools.
How can the PDF object structure be manipulated? Usually attackers omit some syntax or tags  required within the object. This omission, however, seems to be considered as valid structure by PDF reader such as Adobe Reader. For example:

Object without “endobj”
obj 1 0
<< /Length 1337 >>

Object without “endstream”
obj 1 0
<< /Length 1337 >>

So-called bluff trick
obj 1 0
<< /Length 1337 >>
HELLO WORLD!endstream\n

In the 3 examples above, we can see that even when some components are dropped (or added) from/to the structure and  the PDF reader can still render the text without generating any error.

In the last snippet,  we can see the use the bluff trick to confuse the security tools in getting the right portion of stream. When pattern matching technique is used, the script/tool might not get the complete stream content since it got confused between the first and the second endstream. A proper handling of these manipulation should be considered thoroughly in order to get a reliable extraction.

Generalizing the security tools seems to be a crucial task in order for it to work in any conditions encountered. Pattern matching technique alone will not work. Understanding the format within the PDF object helps a lot in the process of generalizing the analysis tools.

For example, in a normal manipulation method, attackers cannot get rid of the “endstream” and “endobj”‘s tag simultaneously. Instead, either “endstream” or “endobj” or both will exist. From our rough solution, a regular expression like />>.*?stream(.*?)(endstream|endobj)/m can be reliably implemented with aid of other filtering mechanism.

New features added to MyKotakPasir 2

A lot of improvements has been added in the last 2 months including security fixes, producing better report output and making the back end analysis engine more stable.

The following are the list of updates:

  • Antivirus scanning results now being taken care by  VirusTotal
  • Import Address Table Hook result
  • Hex Dump output can be downloaded from the report.
  • More details in General Information section.
  • File submission now protected by Captcha to prevent evilness
  • Now MyKotakPasir has been made available to the public

Additional cool features will be available soon!

Feel free to have a go  at it here