VB2014 paper: How they’re getting the data out of your network: a survey of methods used for exfiltration of sensitive data, recommendations for detection and protection

2016-01-01

Eric Koeppen

IBM, USA
Editor: Martijn Grooten

Abstract

Exfiltration of data has been a feature of many attacks in which confidential customer information has been leaked to malicious actors. Such breaches can have disastrous effects on a company’s brand, customer loyalty, and competitive advantage. In his VB2014 paper, Eric Koeppen examines some examples of malicious data exfiltration and explores methods for detecting and mitigating such threats.


Abstract

When malware infects a system, often it is only the first step in a chain of events. Once on a system, malware can move laterally through a network, infecting other systems, and searching for important data. If the malware finds data for which it has been programmed to search, or an attacker is using the malware to poke around opportunistically, it can then send copies of that data to external servers in a process known as exfiltration.

We have seen exfiltration used in many attacks where confidential customer information has been leaked to malicious actors. Such infections can have disastrous effects on a company’s brand, customer loyalty, and competitive advantage.

Starting in 2006, Operation ShadyRAT targeted 72 different companies over a period of five years, exfiltrating massive amounts of information. In the famous Target breach of 2013, 40 million credit and debit card accounts were stolen, along with PII data on another 70 million customers. Target said its data breach has cost $240 million so far, with further litigation threatening to push that cost still higher.

This paper will examine some examples of malicious data exfiltration and explore methods for detecting and mitigating against such threats.

Introduction

As malware authors become more creative, and digital defences become more complex, there are bound to be some casualties of the information security arms race. No longer content with the thrill of wreaking havoc on a system, today’s hackers want to see a return on the investment of time and energy they put into infiltrating systems. Gaining a foothold in a system is only the first step in a chain of events that constitute a modern compromise. Once malware has established itself on a system, it can move laterally throughout a network, infecting system after system, and searching for important data which can then be exfiltrated (i.e. sent to external servers).

Exfiltration has been used in many recent attacks where sensitive information has been leaked to malicious actors. This sensitive information may be proprietary data that drives a company’s business, or private customer data, including financial information. Such infections can have disastrous effects on a company’s brand, customer loyalty, and competitive advantage. Not only can a company lose money through the direct loss of revenue, but the leaking of customer information can harm consumer confidence in the company in the long term, it can lead to costly litigation, and the leaking of trade secrets can compromise the company’s position in the marketplace.

In the rest of this paper, we will look at some examples of malware performing data exfiltration, discuss methods for identifying these types of data breaches, and explore methods for mitigation of data exfiltration.

Exfiltration scenarios

Advanced persistent threat (ShadyRAT) exfiltration

Starting in 2006, Operation ShadyRAT targeted 72 different companies over a period of five years, exfiltrating massive amounts of information. When it was reported by McAfee [1], it was one of the largest advanced persistent threat (APT) campaigns ever made public. ShadyRAT leveraged various infection mechanisms to gain footholds in affected systems, often using targeted spear phishing emails to trick users into opening infected attachments. Once on a system, the malware connected to a remote server for command and control (C&C) instructions and exfiltration. Dell researcher Joe Stewart linked Operation ShadyRAT definitively to the Chinese threat group Comment Crew [2] (a.k.a. APT1 [3]).

ShadyRAT was particularly novel in the way it dealt with C&C communications, often using steganography [4] to hide commands in images by mathematically encoding them into the image data. One of the C&C commands the remote server can issue to a compromised system is an instruction to connect to another remote server on a specified port. When this connection is made, a malicious user on that second server is given a small command-line shell on the infected system. The shell allows the remote user to browse the infected system for interesting data, which can then be transmitted to the remote server. Unfortunately, the format of the exfiltrated data, and even what data is considered ‘interesting’ for exfiltration, is nearly impossible to predict, so filtering of data leaving the system must be based on system-specific guidelines. The initial connection between the infected system and the remote controller involves a predefined handshake:

/*\n@***@*@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>>>>\*\n\r”

It should be possible for most intrusion prevention systems (IPS) to filter on this string. However, if the string is prone to false positives, then another strategy, such as the use of access control lists for the machines that contain sensitive information, would be preferable.
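As a rough illustration of filtering on this string, the following Python sketch checks a captured payload for the handshake. It is a minimal sketch under stated assumptions, not an IPS rule: because it is not clear from the printed form how the escape sequences map to raw bytes, it matches only the printable core of the handshake (the ‘@***@*’ prefix, a long run of ‘@’ characters, then ‘>>>>’). The function name and the minimum run length of 16 are illustrative choices.

```python
import re

# Assumption: match only the distinctive printable core of the handshake
# ("@***@*", a long run of "@", then ">>>>") rather than guessing how the
# escape sequences in the printed string map to raw bytes.
HANDSHAKE_CORE = re.compile(rb"@\*\*\*@\*@{16,}>>>>")


def looks_like_shadyrat_handshake(payload: bytes) -> bool:
    """Return True if the payload contains the handshake's printable core."""
    return HANDSHAKE_CORE.search(payload) is not None
```

Matching a long, unusual token like this keeps the check cheap, but a real deployment would still need to be tested against normal traffic for false positives, as noted above.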

Point of sale malware exfiltration

In the Target case that came to light in 2013, 40 million credit and debit card accounts were stolen [5], along with PII data on another 70 million customers. Target has reported that the data breach has cost the company $240 million so far [6], with further litigation threatening to push that figure higher. The Target attack is representative of point of sale (POS) malware. POS malware typically infects a system via a trojan. Once it has gained a foothold in the system, POS malware typically performs memory scraping on all running processes in search of credit card track data. Credit card track 1 and 2 data (separated by a ‘;’) appears in the following format:

%B1234567890987654^LastName/FirstName^1407777000000001000000003000000?; 1234567890987654=1407777000000300001?

This breaks down as follows:

1234567890987654            16-digit credit card number
LastName/FirstName          Last and first name of card holder
1407                        Expiration date in the format YYMM (year, month)
777                         Service code
000000001000000003000000    Track 1 discretionary data (may include CVC1)
000000300001                Track 2 discretionary data (may include CVC1)
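The breakdown above can be expressed as a pair of regular expressions. The Python sketch below is illustrative only: the names TrackFields, parse_track1 and parse_track2 are hypothetical, and the patterns assume a 13 to 19 digit card number and all-numeric discretionary data, as in the example. The same patterns a scraper would use can equally serve a defender building a content filter.

```python
import re
from typing import NamedTuple, Optional


class TrackFields(NamedTuple):
    card_number: str
    holder_name: Optional[str]  # present on track 1 only
    expiry_yymm: str
    service_code: str
    discretionary: str


# Track 1: %B<card number>^<name>^<YYMM><service code><discretionary>?
TRACK1 = re.compile(r"%B(\d{13,19})\^([^^]+)\^(\d{4})(\d{3})(\d*)\?")
# Track 2: <card number>=<YYMM><service code><discretionary>?
TRACK2 = re.compile(r"(\d{13,19})=(\d{4})(\d{3})(\d*)\?")


def parse_track1(text: str) -> Optional[TrackFields]:
    """Extract the first track 1 record found in text, if any."""
    m = TRACK1.search(text)
    if m is None:
        return None
    number, name, expiry, service, disc = m.groups()
    return TrackFields(number, name, expiry, service, disc)


def parse_track2(text: str) -> Optional[TrackFields]:
    """Extract the first track 2 record found in text, if any."""
    m = TRACK2.search(text)
    if m is None:
        return None
    number, expiry, service, disc = m.groups()
    return TrackFields(number, None, expiry, service, disc)
```

Applied to the example string, parse_track1 yields the card number, ‘LastName/FirstName’, ‘1407’, ‘777’ and the 24-digit discretionary field, and parse_track2 yields the same card number with the 12-digit discretionary field.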

When POS malware finds data formatted like this (or a subset of the above) in process memory, it sends it through one of various transport protocols to a remote server. Occasionally, POS malware will perform a checksum on suspected credit card data using the Luhn algorithm [7] to determine whether it is a valid credit card [8].
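For reference, the Luhn check is simple enough to reproduce in a few lines. The sketch below is a generic implementation of the algorithm, not code taken from any malware sample, and the function name is illustrative. A defender can use the same check to reduce false positives when scanning outbound traffic for card numbers.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in number pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0
```

Note that the placeholder card number used in the track data example above does not pass this check, whereas the widely used test number 4111111111111111 does.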

Often, the data is reformatted for transmission. Commonly, track 2 data (i.e. the data following the ‘;’ in the example above) will be stripped of the ‘=’ sign and sent out. Certain variants of POS malware encrypt the data for transmission, which makes credit card data exfiltration very difficult to detect via content filtering. The transport methods used by a subset of POS malware observed in the past year have included HTTP POST, HTTP GET, FTP and SMB/NetBIOS. The Target POS malware connects to a specific UNC path and copies data to a shared directory [9].

IPS signatures that search for valid credit card numbers will often catch credit card data being exfiltrated, as long as the data is not encrypted. If a local intrusion detection mechanism can detect attempted reads across process memory, this will help to identify memory scraping. If memory protection can be put in place that prevents one process from accessing the memory of another, this should prevent the memory scraping, and so prevent the credit card data from being gathered in the first place.

Financial malware exfiltration

Variants of the Zeus banking trojan have terrorized the banking industry for years, accounting for losses of hundreds of millions of dollars. The Gameover Zeus variant alone has accounted for over $100 million in theft since 2011 [10]. There are many different variants of the Zeus banking trojan and they use various methods for exfiltrating banking information.

Variants of the Zeus trojan have been known to check the Internet Explorer browsing history and address bar for a list of known banking websites. If any such sites are found, Zeus prepares a mock-up of the site which is then served to the victim when they try to browse to it. The mocked-up web pages are just a ruse to harvest login credentials for those sites. Once gathered, the login credentials are encrypted, stored in a data file, and then transmitted to a remote server via HTTP POST [11]. By the time the data is exfiltrated, it is very late in the process, and stopping the infection before it gets to that point is no doubt preferable. To catch the outbound traffic, filtering on known servers (typically serving ‘coolstar.php’) can be very helpful.
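A proxy or network monitor could apply the outbound filter along the following lines. This Python sketch is a simplified illustration: it inspects only the HTTP request line, the function and set names are hypothetical, and the single script name comes from the text above. In practice the list would be fed from threat intelligence and combined with destination filtering.

```python
from urllib.parse import urlsplit

# Script names associated with known drop servers. Only "coolstar.php" comes
# from the text; anything else would be added from threat intelligence.
SUSPECT_SCRIPT_NAMES = {"coolstar.php"}


def is_suspect_post(request_line: str) -> bool:
    """Flag an HTTP request line that POSTs to a known drop script."""
    parts = request_line.split()
    if len(parts) != 3:
        return False
    method, target, _version = parts
    if method.upper() != "POST":
        return False
    # Compare only the last path segment, ignoring any query string.
    path = urlsplit(target).path
    return path.rsplit("/", 1)[-1].lower() in SUSPECT_SCRIPT_NAMES
```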

Beyond just checking web history, Zeus variants have been known to ransack systems for other sensitive data, either to steal or to make additional use of. This search includes the following:

  • Parsing cookie files for local data-containing files

  • Stealing digital certificates

  • Stealing local private keys

  • Stealing FTP client information

  • Parsing registry keys for valuable information

  • Stealing settings for mail clients like Windows Live Mail and Outlook.

Once the above data has been harvested, it is saved in a file which is then encrypted and sent back to a remote server. As above, it is important to monitor the system locally and catch the malware before the encryption occurs.

Zeus variants have been very successful with keylogging methods, and some include graphical keyloggers that capture screenshots in the case of online banking sites that use graphical or virtual keyboards instead of traditional keyboard-based input methods. Often this data is encrypted before exfiltration, so network-based detection of the data itself becomes extremely difficult. Use of process monitoring and memory protection to detect and stop the malicious processes that gather the information becomes key. Since content filtering cannot inspect encrypted data, it is all the more important to specify and lock down the servers with which machines containing sensitive data are allowed to communicate, and to maintain a blacklist of known exfiltration servers.
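The lock-down described above amounts to a default-deny egress policy with an explicit blacklist. The following Python sketch shows the decision logic only. All of the address ranges are placeholders (a hypothetical internal subnet and a reserved documentation range), and in a real network the policy would be enforced by a firewall or ACL rather than application code.

```python
import ipaddress

# Placeholder policy for a machine that holds sensitive data.
ALLOWED_DESTINATIONS = [
    ipaddress.ip_network("10.20.0.0/24"),    # hypothetical internal servers
]
BLOCKED_DESTINATIONS = [
    ipaddress.ip_network("203.0.113.0/24"),  # stand-in for known exfiltration servers
]


def egress_decision(dest_ip: str) -> str:
    """Return 'allow', 'deny-blacklisted' or 'deny-default' for a destination."""
    addr = ipaddress.ip_address(dest_ip)
    # The blacklist is checked first so that it wins over any allowed range.
    if any(addr in net for net in BLOCKED_DESTINATIONS):
        return "deny-blacklisted"
    if any(addr in net for net in ALLOWED_DESTINATIONS):
        return "allow"
    # Default deny: anything not explicitly allowed is refused.
    return "deny-default"
```

Distinguishing the two deny outcomes is a deliberate choice: a hit on the blacklist is a strong indicator of compromise and warrants an alert, whereas a default deny may simply be a misconfiguration.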

Recently, Zeus variants have started to use P2P proxy bots [12]. These proxy bots gather stolen data from other bots in the botnet for exfiltration. Payload messages are all hashed, signed, and then encrypted with RC4. Detecting the presence of the P2P botnet on the network is key, and can be done by tuning network monitors to detect the P2P keep-alive messages. Keep-alive messages typically have the fourth byte set to 0x00 as a version request type [12]. If a peer fails to answer version request type messages within five tries, then the malware will try to connect to www.google.com or www.bing.com to verify that it has Internet access. Detecting these connectivity checks in amongst legitimate HTTP traffic is not always feasible.
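A network monitor tuned for the keep-alive messages might apply a heuristic such as the one sketched below in Python. This is an illustration under several assumptions: messages are supplied as (source, destination, payload) tuples by some capture tool, the only protocol detail used is the fourth-byte value given above, and the threshold of ten distinct peers is an arbitrary example. A single byte test will match a great deal of unrelated traffic, so the sketch flags only hosts that send such messages to many different peers.

```python
from collections import defaultdict
from typing import Iterable, Set, Tuple

VERSION_REQUEST_TYPE = 0x00  # value of the fourth byte, per the text
TYPE_OFFSET = 3              # zero-based index of the fourth byte


def is_candidate_keepalive(payload: bytes) -> bool:
    """Return True if the fourth byte marks a version request."""
    return len(payload) > TYPE_OFFSET and payload[TYPE_OFFSET] == VERSION_REQUEST_TYPE


def hosts_with_many_peers(
    messages: Iterable[Tuple[str, str, bytes]], min_peers: int = 10
) -> Set[str]:
    """Return sources that sent candidate keep-alives to at least min_peers peers."""
    peers = defaultdict(set)
    for src, dst, payload in messages:
        if is_candidate_keepalive(payload):
            peers[src].add(dst)
    return {src for src, dsts in peers.items() if len(dsts) >= min_peers}
```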

Conclusion

There is an arms race going on between those who wish to protect sensitive data and those who wish to gain unfettered access to it for their own purposes. Malware authors are becoming extremely creative, not only with their infection methods, but also with the methods they use for the exfiltration of sensitive data. This paper has described some widely used data exfiltration scenarios, and has suggested methods for detection and protection where possible. With advanced persistent threat (ShadyRAT) exfiltration, we saw that outbound traffic between the infected host and the remote server can be identified by a predefined handshake token. With point of sale malware exfiltration, we saw that outbound data can be filtered if it is not encrypted and if it resembles the known formats for credit card track 1 and track 2 data. With financial malware exfiltration, we saw that local detection/prevention is key, as the data is typically encrypted before exfiltration. However, we also saw that internal proxy keep-alive communications can be detected for peer-to-peer implementations. This paper should help enhance the arsenal of those looking to detect and protect data that is leaving a compromised network.

Bibliography
