Collector-stealer: a Russian origin credential and information extractor

Aditya K Sood & Rohit Chaturvedi

Advanced Threat Research Center of Excellence, Office of the CTO, F5


 

Introduction

Collector-stealer, a piece of malware of Russian origin, is heavily used on the Internet to exfiltrate sensitive data from end-user systems and store it in its C&C panels. The stolen information is consumed in the underground cyberspace for nefarious purposes. In this article, we present a 360 analysis of the Collector-stealer malware to unearth hidden artifacts covering binary analysis, its working, and the design of associated C&C panels. While carrying out our research on Collector-stealer we found enough indicators to lead us to believe that the malware author is based in Russia. The attacker primarily uses Collector-stealer to target European countries, but it also affects users in other countries such as the USA, China, Cambodia and others. Let’s start the analysis of Collector-stealer to unearth some insights. We will refer to the malicious code as ‘Collector-stealer’ throughout this article.

 

Collector-stealer distribution mechanisms

We began our research on the Collector-stealer malware by looking at how this malicious code has been distributed across the Internet. The Collector-stealer author uses multiple methods to launch infections, which include coercing users to visit phishing portals hosting free game downloads, Windows activation/crack software packages, etc., to trigger drive-by download attacks that install the malware on the fly. Drive-by download attacks can be executed by exploiting vulnerabilities in client software such as browsers or abusing the inherent design of browsers. For Collector-stealer distribution, the phishing emails contained messages that appeared to have come from legitimate authorized entities. With additional efforts, we analysed the domains (or subdomains) contained in the phishing emails to collect more intelligence. In the following sections we discuss a few different examples to explain how the attacker distributes the malware.

 

KMSAuto activation utility bundled with Collector-stealer

KMSAuto (see Figure 1) is a free Windows/Office activation utility. In fact, KMSAuto is an unauthorized piece of software that allows the user to circumvent the integrity of the operating system to install cracked tools and is also categorized as a hack tool or ‘riskware’, allowing end-users to use MS Office products illegally. Generally, hack tools like KMSAuto enable attackers to bundle their malicious code with the cracked utility so that legitimate applications can be used in conjunction with malicious code, a technique known as piggybacking.

Using the piggybacking technique, malicious code rides on the back of the KMSAuto utility and activates the legitimate application illegally, to inject additional malicious payloads. Collector-stealer is distributed in this manner. Upon execution of the utility, KMSAuto activates the application at the front, but in the backend it creates a sub-process that installs Collector-stealer on the victim’s machine and starts to communicate with the C&C server.

Fig1-collector-stealer.pngFigure 1: KMSAuto crack utility.

A number of other riskware utilities are used to serve the same purpose as KMSAuto. The majority of these packages relate to games or Discord VPN, such as GTAV, NewWay ModMenu, COD Manager, etc. The quote ‘Nothing comes for free on the Internet’ is worth mentioning here: if a user downloads any crack utility, software key generator or cracked games/software from an untrusted site, there is high chance that it will lead to the user’s machine being infected with malicious code.

 

Collector-stealer downloading via fake miner web portal

KMSAuto is not the only method through which Collector-stealer is distributed. Phishing web portals mimicking content from legitimate software provider sites are also used to spread the malware. During our research, we discovered that the attacker hosted a phishing web portal that replicated the content from a cryptocurrency software provider portal. The phishing web portal hosted Collector-stealer packages, and when the user visited the web portal, Collector-stealer was downloaded on the fly. The phishing web portal tricked users into believing that they had downloaded legitimate miner software, which was not the case.

Figure 2 shows a comparison between the phishing portal and the legitimate portal.

Fig2-collector-stealer.jpg

Figure 2: Phishing web portal on the left vs legitimate web portal on the right.

The phishing web portal was ethhomework[.]online and the legitimate web portal was hostero[.]eu. The attacker mirrored the content from the hostero[.]eu site and also added socially engineered messages such as offering bitcoin after downloading, which is nothing but bait for users.

Now we have an understanding of the distribution mechanisms deployed by the attackers to spread Collector-stealer. Next, we present a technical analysis.

 

Technical analysis

 

Malware history and objective

Public reports have already been published about this malware. The malware has been active since mid-2020 and is still active in the wild and actively compromising victim machines. Collector-stealer, a stealth stealer written in C++, infects the victim machine to steal valuable information such as stored passwords, cookies, web data and more, from the infected machine. Collector-stealer, as its name suggests, is collectively used by many bad guys to exfiltrate data from across the world.

Let’s start an in-depth analysis of the main program and have a look at the functionalities this malware is equipped with.

 

Portable executable file structure

In this section we will present a basic structure of the Collector-stealer malware using static analysis. Figure 3 shows the portable executable (PE) file structure. Overall, PE file structure analysis helps us to understand the design of an executable. PE format is composed of Common Object File Format (COFF), object code, dynamic link libraries (DLLs), font files, and core dumps in 32-bit and 64-bit versions of Windows operating systems. The PE format is a data structure that provides the information to the Windows loaders required to load the executable code in the memory. This includes dynamic library references for linking, API export and import tables, resource management data, and other structures.

Fig3-collector-stealer.png Figure 3: PE structure of the Collector-stealer malicious code.

Now we have provided a basic overview of Collector-stealer, including the basic details of the PE structure. Let’s analyse the binary for more insights.

 

Import address table and critical functions

It is important to analyse the PE import address table (IAT) to understand the API functions being imported from the system DLLs and used during the execution of the binary. When a PE file is loaded into the system, the Windows loader is required to read the PE structure and load the image directly into the memory. During this process, the loader is also responsible for loading all the DLLs that the executable is going to use into the associated process address space. The mapping of DLLs and related API functions is managed through the import address table, which contains the function pointers that the Windows loader copies while DLLs are loaded. In simple terms, the IAT defines which API functions the executable is going to consume and performs related operations. We extracted the IAT table of the Collector-stealer malware and the most critical API functions are discussed in Table 1 to help to understand how Collector-stealer works in the system.

DLL Critical API functions Basic overview
urlmon.dll URLDownloadToFileA Used to download files from the Internet.
wininet.dll InternetCloseHandle
InternetOpenA
InternetConnectA
HttpSendRequestExA
HttpEndRequestA
InternetWriteFile
HttpOpenRequestA
The library contains modules that help applications to interact with FTP and HTTP protocols to access Internet resources. This library contains Internet-related methods.
crypt32.dll CryptUnprotectData Microsoft Cryptographic library, which implements many of the certificate and cryptographic messaging functions in the CryptoAPI, such as CryptSignMessage.
ADVAPI32.dll RegGetValueA
RegOpenKeyExA
Advanced Application Programming Interface (AdvAPI32) is designed to support security calls and registry manipulation calls.
Shell32.dll SHGetFolderPathA Designed to provide Windows Shell API functions, which are used to open web pages and files.
User32.dll CloseClipboard
FindWindowA
OpenClipboard
keybd_event
GetDesktopWindow
ShowWindow
GetClipboardData
Contains modules that help to implement the Windows USER component, additionally provides functions by which we can simulate user behaviour.
Kernel32.dll OutputDebugStringW
IsDebuggerPresent
CreateFileW
DeleteFileW
LoadLibraryA
TerminateProcess
GetCurrentProcess
FindFirstFileExW
FindNextFileW
GetFileAttributesEx
VirtualProtect
WriteFile
ReadFile
GetStartupInfoW
Kernel32.dll is one of the major libraries in Windows machines. It provides functions for process and thread creation, memory management and more. It’s a user-mode library.
gdiplus.dll GdiplusShutdown
GdipCreateBitmapFromHBITMAP
GdipGetImageEncoders
GdipCreateBitmapFromScan0
GdipSaveImageToStream
GdipGetImageEncodersSize
GdipDisposeImage
GdiplusStartup
Microsoft Graphics Device Interface Library, designed to handle graphics components for images.

Table 1: Import address table.

 

Obfuscation routine

In this section we discuss the encryption and decryption procedures used by Collector-stealer. Listing 1 highlights the pseudocode that Collector-stealer uses to deobfuscate the Windows public directory path address and other obfuscated strings using the key (string) ‘1A’.

Step 1 : Encoded_String : "BB;YKXYBB6[HROIBB"
Step 2 : Deobfuscate_str : ""
Label 1:
Deobfuscate_str =Deobfuscate_str+HEX(String[i])+1A
If i<=Length(Encoded_String) :
i=i+1
Goto Label 1

 Listing 1: Pseudocode.

On successful deobfuscation of the code above and executing in a controlled manner, a number of strings were deobfuscated to obtain the clear text that clarifies how exactly Collector-stealer interacts with various resources in the system. Table 2 shows a number of clear text strings obtained after deobfuscation of the inherent routine used by Collector-stealer.

Encrypted string Decrypted strings
2UMOT*GZG Login data
)UUQOKY Cookies
=KH*GZG Web data
] GRRKZJGZ wallet.dat
HXU]YKX browser
JKYQZUV0x14VTM desktop.png
0x15 0x3AKRKMXGS 0x6 0x2A KYQZUV 0x15ZJGZG /Telegram Desktop/tdata
9ZKGS6GZN SteamPath
95,:=’8+BB<GR\KBB9ZKGS SOFTWARE\\Valve\\Steam
0x14\JL .vdf
YZKGS /con

 Table 2: Decrypted strings from the binary.

Whenever in the code the malware needs to use a string, it takes the encrypted string and passes it into functions to deobfuscate it.

 

On-demand pseudo number generator

To store all collected data into a file Collector-stealer performs additional operations to generate random file names by utilizing the power of Windows APIs to generate pseudo-random numbers, i.e. seed values to create random file names, determining the length of the calculated strings, etc. To create a random string file name, Collector-stealer calls GetSystemtimeAsFileTime, as shown in Figure 4.

Fig4-collector-stealer.png Figure 4: Extracting the date and time from the system.

The GetSystemtimeAsFileTime API is used to obtain the current date and time of the system. The API processes 64-bit values, representing LowDateTime (low-order part of the file time) and HighDateTime (high-order part of the file time) under the FILETIME structure. In general, Collector-stealer validates the condition by verifying the data and time limits to calculate the current time of the system. The calculated date and time value act as seed values to create a random string for generating the file name. Figure 5 shows an algorithm used by Collector-stealer to generate 15-word-long random strings, which are later used as file names.

Fig5-collector-stealer.png

Figure 5: Algorithm to create random string.

Figure 6 shows Collector-stealer creating a new file by calling the CreateFileW [1] function utilizing the same algorithm as highlighted in Figure 5.

Fig6-collector-stealer.png Figure 6: Create randomized named file.

Collector-stealer also performs additional operations to extract the location and attempts to check the current time zone setting using the GetTimeZoneInformation function to check which time zone the victim machine belongs to, along with the computer name by calling the GetComputerName method.

 

Extracting screenshots via keyboard operations

In this section the focus is on analysing the interaction of Collector-stealer with the Windows clipboard and keyboard events to take screenshots. In general, clipboard operations cover the cut and copy functions that Windows provides. The clipboard history reflects the number of copy operations that are present in the clipboard. One of the main functionalities of Collector-stealer is to steal data via screenshot captures and store them in the clipboard. As shown in Figure 7, to capture the screenshot, the malware uses the Keybd_event function from user32.dll and simulates the user pressing ‘Take Screenshot’, bypassing ‘VK_SNAPSHOT’ as a key parameter in the keybd_event function.

Fig7-collector-stealer.png

Figure 7: Simulates screenshot key and stores captured image in clipboard.

Later, Collector-stealer accesses the captured image from the clipboard in bitmap format and converts it into stream (Figure 8).

Fig8-collector-stealer.png

Figure 8: Save captured image to stream.

Fig9-collector-stealer.png

Figure 9: Desktop screenshot hex.

Figure 9 is a byte stream of the captured screenshot, initial bytes ‘FF D8 FF E0’, which is the four-byte signature of JPEG IMAGE. The malware writes a byte stream from memory into a newly created file in ‘\\public\\directory’. After taking a screenshot, Collector-stealer starts to look for other data such as finding Telegram data in the ‘tdata’ folder located at ‘%Appdata%’. The ‘tdata’ folder is the directory in which Telegram Desktop keeps all media-related files, session details, etc.

 

Client-side data access

In this section we will look at how Collector-stealer searches for sensitive information stored in specific directories. After getting basic information, Collector-stealer starts to iterate each folder and file in the location ‘C:\Users\alert-user\AppData\Roaming\’ and looks for file login data, cookie files, web data, and wallet.dat. Let’s understand the type of content stored in these files. The details are discussed below:

  • Login data: Chrome logins are stored in the ‘Login Data’ SQLite database, within the ‘logins’ table.
  • Cookies: Chrome cookies are stored in the ‘Cookies’ SQLite database, within the ‘cookies’ table.
  • Web data: Chrome form history is stored in the ‘Web Data’ SQLite database, within the ‘autofill’ table.
  • Wallet.dat: the Bitcoin client stores private key information in a file named wallet.dat. The wallet.dat file contains private keys, public keys, scripts (which correspond to addresses), key metadata (e.g. labels), and the transactions related to the wallet.

Once Collector-stealer has completed the scanning of the primary directory, it iterates the ‘\\AppData\\Local’ directory and once again looks for file login data, cookie files, web data, and wallet.dat. Collector-stealer also scans all desktop files and looks for files with specific extensions such as .txt, .log, .rdp and others. If any file is found with such an extension, it reads the content of the file and stores the data in a compressed RAR file.

Collector-stealer uses the URLDownloadToFile API to download sqlite3.dll from the C&C server, as shown in Figure 10, and stores it in the public directory.

Fig10-collector-stealer.png Figure 10: Collector-stealer downloads the sqlite3 file.

The majority of modern browsers use SQLite databases to store data locally specific to different websites and applications. The data includes cookies, login data, autofill data, web data, etc. The parsing of an SQLite database requires the SQLite library. Collector-stealer leverages this functionality to obtain data in clear text from the SQLite database stored in the compromised end‑user systems. SQLite’s library allows Collector-stealer to run SQL queries for data extraction. After successfully reading the data, Collector-stealer compresses all the stolen data, including directory structure, into Zip format. Figure 11 shows how Collector-stealer structures the collected data into a single archive file.

Fig11-collector-stealer.png Figure 11: Malware downloads sqlite3 file.

Table 3 represents the complete directory structure compressed as a Zip file and sent by Collector-stealer to the C&C server.

Information.txt Contains basic information such as launch time, username, etc.
desktop.png Desktop screenshot.
/Browser/ Contains collected cookies data, autofill data, stored passwords.
/Files/ Contains collected files from a desktop location whose file types are .txt, .log, .docx, .rdp.
/Telegram/ Contains collected data from the data directory, i.e. Telegram data.
/Steam/ Contains collected data from the steam directory as well as Steam registry entries including active user details.

Table 3: Archived directory insight.

 

Network communication

Let’s understand the network communication, i.e. how this malicious code interacts with the C&C server and transmits the stolen data through the network. After successfully collecting the sensitive data from the system, and before transmitting the data to the C&C server, as shown in Figure 12 Collector-stealer creates a new command line sub-process to check Internet connectivity on infected machines by pinging Cloudflare DNS resolver IP address 1.1.1.1. If the ping request fails, then it deletes the executable file along with all previously collected data and silently exits.

Fig12-collector-stealer.pngFigure 12: Collector-stealer initiates ping request.

Let’s dissect the command: ‘cmd.exe /C ping 1.1.1.1 -n 1 -w 1000 > Nul & Del /f /q \”%s\”’ to understand the complete operation.

  • ‘cmd.exe /C’: run the command and then terminate after receiving the response.
  • ‘ping 1.1.1.1 -n 1 -w 1000’
    • ‘-n 1’: total number of echo request should be 1.
    • ‘-w 1000’: number of milliseconds to wait for the echo reply message corresponding to a given echo request message.
    • ‘1.1.1.1’: Cloudflare DNS server that is publicly available to trigger DNS queries in a fast manner.
    • ‘> nul’: The ‘nul’ data is a special file that discards the data written to it. Basically, it is used to hide the output of the command and avoid the use of the command output by the other process.
    • ‘Del /f /q \\\%s\\\’: the command deletes the referenced file quietly (/q) and forcefully (/f) on the fly to remove its traces on the end-user system.

Upon successful connectivity checks, Collector-stealer sends collected data to the C&C server. Figure 13 shows the malware using HTTP POST requests to transmit stolen data using zip files.

Fig13-collector-stealer.png Figure 13: Collector-stealer exfiltrating data over HTTP channel.

If you see the ‘user-agent’ string, Collector-stealer does not use the standard browser identifier, rather a custom string ‘uploader’. The data exfiltration process utilizes the HTTP communication channel. HTTP POST requests are initiated by Collector-stealer to exfiltrate the data in a compressed format (zipped) to the C&C panel managed by the attacker. The unzipping of the data occurs in the C&C panel itself and stolen data is uploaded in the C&C portal for utilization.

 

Command-and-control panel design

In this section, you will gain an understanding of the design of Collector-stealer’s C&C panel. Collector-stealer C&C uses HTTP protocol for communication purposes. The data extracted from the compromised machine is sent to the C&C server in an archive format over HTTP. The Collector-stealer C&C server accepts the HTTP POST requests to receive data from infected machines. The Collector-stealer C&C panel is developed using PHP using openresty as a web server. Figure 14 shows the C&C login panel.

Fig14-collector-stealer.png

Figure 14: Collector-stealer administrative C&C panel.

Figure 15 shows how attackers keep stolen data in a more structured way.

Fig15-collector-stealer.png

Figure 15: Collector-stealer: stolen logs storage functionality.

 

Inference

Information stealers actively target victims to exfiltrate sensitive information from the compromised systems. In this article we have discussed the analysis of the Russian origin Collector-stealer malware. The data stolen by Collector-stealer is used for nefarious purposes – it may be sold in the underground market, the information may be harnessed for launching additional attacks, etc. Due to its popularity in the wild, some adversarial groups have released cracked versions of Collector-stealer on the Internet and made it freely available. The intelligence presented in this research can be utilized to enhance detection and prevention solutions to combat the risks posed by the Collector-stealer malware.

 

References

[1] https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilew.

 

Appendices

 

Appendix 1: Threat actor profile

As Collector-stealer is bundled with numerous stealing features, it quickly gained popularity on underground forums. Due to its extensive list of services offered, as shown in Figure 16, many forum users were keen to buy this stealer and even some groups attempted to provide cracked versions.

Our research team started to investigate the malware and its author and discovered some interesting facts. Collector-stealer is sold by the ‘Hack_Jopi’ group (Figure 17), and after some more digging we found that the author sells this malware only to Russian users. The author has been active on the forum since October 2018, is still active and periodically releases updates. As per the forum’s updates, the last update version was v1.20 ‘aimed at fixing the collection of Telegram, Discord and Steam sessions’. But we also saw some samples with 1.21 version (malware mentioned version_id in drop file).

To cover maximum audiences for Collector-stealer, the author uses other platforms as such as Telegram, where authors create their own channel and publicize their stealer. In forums, they created a separate account with the name ‘CollectorSoft’. Most of the accounts were created between May and June 2020. So we can conclude that this malware started spreading in mid-2020 and is still active in the wild.

Fig16-collector-stealer.png

Figure 16: Services offered.

Fig17-collector-stealer.pngFigure 17: Forum and Telegram accounts.

 

Appendix 2: Indicators of compromise

A number of malware hashes and list of C&C servers related to Collector-stealer are presented in Listings 2 & 3:

a9e3f9fb9cf5ae8dcbfd139ecadb961a
bd27acd9bc0ba05847dc0d8ea443e437
253ce038dd0e2a30165f24b18aaa34d3
eb8e99e82b6ed97f89292467aa8dc866

Listing 2: Indicators of compromise: malware hashes.

f0537213[.]xsph[.]ru
fata-collector[.]online
f0542175[.]xsph[.]ru
f0538564[.]xsph[.]ru
f0537214[.]xsph[.]ru
f0548561[.]xsph.ru
collector-node[.]us
collector-gate01[.]us
collector-steal[.]ga
a0556434[.]xsph[.]ru //Kpot Panel

Listing 3: Indicators of compromise: C&C domains.

 

Appendix 3: Collector-stealer advertisement

Figure 18 shows an example of an online advertisement for Collector-stealer presented in the underground community. We can clearly see the advertisement is in Russian language.

Fig18-collector-stealer.png

Figure 18: Advertisement on forum.

 

Appendix 4: MITRE ATT&CK TTP mapping

In Table 4 we correlate malware activity with MITRE ATT&CK TTP mapping.

Tactic Identifier Name Description
Collection T1074.001 Data Staged: Local Data Staging Store collected data at
//Public// directory
T1560 Archive Collected Data Archive collected data before sending
T1113 Screen Capture Take desktop screenshot
Discovery T1083 File and Directory Discovery Look for web data/cookies/login/desktop files
T1124 System Time Discovery Gather the system time and/or time zone
T1012 Query Registry Query HKCU for Steam entry
T1082 System Information Discovery Extract OS version
T1016.001 System Network Configuration Discovery: Internet Connection Discovery Ping 1.1.1.1 to check Internet connectivity
Execution T1059.003 Windows Command Shell Ping 1.1.1.1 using cmd.exe
Credential access  T1555 Credentials from Password Stores Steal browser passwords
T1539 Steal Web Session Cookie Steal cookies
Command and control   T1071.001  Application Layer Protocol: Web Protocols Send data to the C&C server via POST requests 
T1105  Ingress Tool Transfer Download sqlite3.dll  
Defence evasion   T1027  Obfuscated Files or Information  Contain obfuscated strings 
T1036.001  Invalid Code Signature  Invalid signed executables 

Table 4: MITRE ATT&CK TTP mapping.

 

Appendix 5: Comparison of multiple stealers

While carrying out research on this malware, we tried to determine if there is any correlation between it and any other malware family. There is a strong opinion that the Collector-stealer author utilizes the same components as those found in the code of Hunter stealer. Hunter stealer is an information stealer whose functionality is almost identical to Collector-stealer. Even the file creation and file drop pattern provided by the two stealers are the same. And most importantly, many buyers of this malware reported that logs go through the seller’s bot. Another possibility is that the author is the same for both malware families. This is not concrete information, but it’s a good starting point for collecting more intelligence for attribution.

Download PDF

twitter.png
fb.png
linkedin.png
hackernews.png
reddit.png

 

Latest articles:

Collector-stealer: a Russian origin credential and information extractor

Collector-stealer, a piece of malware of Russian origin, is heavily used on the Internet to exfiltrate sensitive data from end-user systems and store it in its C&C panels. In this article, researchers Aditya K Sood and Rohit Chaturvedi present a 360…

Fighting Fire with Fire

In 1989, Joe Wells encountered his first virus: Jerusalem. He disassembled the virus, and from that moment onward, was intrigued by the properties of these small pieces of self-replicating code. Joe Wells was an expert on computer viruses, was partly…

Run your malicious VBA macros anywhere!

Kurt Natvig wanted to understand whether it’s possible to recompile VBA macros to another language, which could then easily be ‘run’ on any gateway, thus revealing a sample’s true nature in a safe manner. In this article he explains how he recompiled…

Dissecting the design and vulnerabilities in AZORult C&C panels

Aditya K Sood looks at the command-and-control (C&C) design of the AZORult malware, discussing his team's findings related to the C&C design and some security issues they identified during the research.

Excel Formula/Macro in .xlsb?

Excel Formula, or XLM – does it ever stop giving pain to researchers? Kurt Natvig takes us through his analysis of a new sample using the xlsb file format.


Bulletin Archive

We have placed cookies on your device in order to improve the functionality of this site, as outlined in our cookies policy. However, you may delete and block all cookies from this site and your use of the site will be unaffected. By continuing to browse this site, you are agreeing to Virus Bulletin's use of data as outlined in our privacy policy.