Wednesday 14 October 15:00 - 15:30, Green room
Yuki Umemura (Kobe University)
Lumma Stealer (LummaC2) is one of the most prominent information stealers in the current landscape. Recent builds combine more than ten obfuscation techniques, including mixed Boolean-arithmetic (MBA) obfuscated string decryptors, a second-stage inner encoding, indirect-jump anti-disassembly, zeroed switch tables, and large-scale control-flow flattening (CFF), that tend to break a normal static analysis workflow before it can reach configuration data, strings, or the true control flow. Google/Mandiant have reported CFF recovery on an older Lumma build, but to our knowledge, no public work has described the full protection stack of the January 2026 build end-to-end without heavy symbolic execution or a custom decompiler. This talk presents a reproducible workflow that pushes static deobfuscation to its practical limit on this sample.
For control-flow recovery, we develop a three-stage pipeline: dispatcher neutralization, flow reconnection, and conditional-branch restoration. Against approximately 467 CFF dispatchers (including ~70 split-instruction variants) and 288 indirect jumps encoded as FF 25, this workflow recovers 274 decompilable functions (87% of the CFF-region functions) and about 6,100 lines of pseudocode from regions Hex-Rays previously refused to decompile. We further find that the four apparent CFF clusters are in fact parts of a single ~102 KB cross-cluster function. We also show that the CFF has evolved since the older build: dispatch tables moved from .data-resident encoded offsets to stack-based runtime computation, and switch tables were zeroed to destroy branch-target information, rendering the prior resolution approach ineffective on this build.
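As a rough illustration of the first step in locating such indirect jumps, the sketch below scans a raw code buffer for the FF 25 byte pattern and reads the 32-bit operand that follows it. It assumes 32-bit x86 code, where FF 25 encodes a jump through an absolute memory slot; the function name, the sample bytes, and the base address are hypothetical and are not taken from the released scripts. A plain byte scan also matches data and instruction operands, so the hits are only candidates that a disassembler would need to confirm.

```python
import struct

def find_ff25_jumps(code: bytes, base: int = 0):
    """Return (address, slot) pairs for every FF 25 <disp32> byte pattern.

    address is the location of the FF byte (base + offset); slot is the
    little-endian 32-bit operand, i.e. the memory cell the jump reads its
    target from in 32-bit mode. Hits are candidates, not confirmed jumps.
    """
    hits = []
    pos = code.find(b"\xff\x25")
    # Stop once fewer than 6 bytes remain: the instruction would be truncated.
    while pos != -1 and pos + 6 <= len(code):
        (slot,) = struct.unpack_from("<I", code, pos + 2)
        hits.append((base + pos, slot))
        pos = code.find(b"\xff\x25", pos + 1)
    return hits

# Hypothetical buffer: nop; jmp [0x401000]; ret; jmp [0x401004]
sample = b"\x90\xff\x25\x00\x10\x40\x00\xc3\xff\x25\x04\x10\x40\x00"
for address, slot in find_ff25_jumps(sample, base=0x401000):
    print(f"{address:#x}: jmp dword ptr [{slot:#x}]")
```

In a real workflow the candidate list would be filtered against known instruction boundaries before any patching is attempted.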
For string and data recovery, we process 610 call sites across 460 decrypt wrappers with zero extraction failures and zero decryption failures, covering 11 observed MBA decryptor variants. The recovered dataset contains 128 UTF-16LE strings, 73 UTF-8 strings, eight GUIDs, three shellcode buffers, and 396 binary entries. A second-pass decoder recovers 51 additional strings and 22 binary entries by modelling an inner separator-based encoding. We further show that the observed MBA wrappers collapse to a small family of XOR±ADD/SUB-style byte-wise transforms and, based on that understanding, we implement a standalone Unicorn-based decryptor (~600 lines, no IDA, no manual parameters). This lightweight tool matches 606/610 entries recovered by the IDA implementation (99.3%), demonstrating how thorough manual analysis leads to reusable, generalized tooling.
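To make the idea of MBA wrappers collapsing to simple byte-wise transforms concrete, the sketch below uses two textbook MBA identities, x ^ y = (x | y) - (x & y) and x + y = (x ^ y) + 2(x & y), and checks exhaustively over all byte values that an MBA-style "XOR then SUB" decryptor equals its plain form. The XOR-then-SUB ordering, the key and delta parameters, and all function names are illustrative assumptions; the actual variants and constants in the sample are not described here.

```python
def mba_xor(x: int, y: int) -> int:
    # MBA rewriting of x ^ y on bytes: (x | y) - (x & y)
    return ((x | y) - (x & y)) & 0xFF

def mba_add(x: int, y: int) -> int:
    # MBA rewriting of x + y on bytes: (x ^ y) + 2 * (x & y)
    return ((x ^ y) + 2 * (x & y)) & 0xFF

def decrypt_obfuscated(c: int, key: int, delta: int) -> int:
    # Hypothetical wrapper: XOR with key, then subtract delta, both in MBA
    # form. Subtraction is expressed as adding the two's complement.
    return mba_add(mba_xor(c, key), (-delta) & 0xFF)

def decrypt_plain(c: int, key: int, delta: int) -> int:
    # The normalized form the MBA expression collapses to.
    return ((c ^ key) - delta) & 0xFF

def decrypt_buffer(data: bytes, key: int, delta: int) -> bytes:
    return bytes(decrypt_plain(c, key, delta) for c in data)

# Exhaustive equivalence check over every ciphertext byte and key byte.
assert all(
    decrypt_obfuscated(c, k, d) == decrypt_plain(c, k, d)
    for c in range(256) for k in range(256) for d in (0, 1, 0x7F, 0xFF)
)

# Round trip on a UTF-16LE string with made-up parameters.
key, delta = 0x5A, 0x11
encrypted = bytes((((p + delta) & 0xFF) ^ key) for p in "wallet".encode("utf-16-le"))
print(decrypt_buffer(encrypted, key, delta).decode("utf-16-le"))
```

Because the domain is a single byte, brute-force checking is cheap, which is one reason algebraic normalization of byte-wise MBA expressions can be validated without symbolic execution.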
The combined deobfuscation reveals key attack-chain components hidden behind these protections. CFF-protected functions handle exfiltration path construction, COM/WMI initialization, and batched encrypted-string setup. The decrypted data reveals Chrome/Firefox credential theft, crypto-wallet extraction via browser extensions, a Steam-profile dead-drop resolver for C2 retrieval, WMI fingerprinting, Cloudflare-bypass cookies, and COM-based persistence. We also observe that the dominant decryptor variant (Type 1) accounts for 490/610 entries (about 80%), while high-value strings such as PowerShell commands, Firefox key-database paths, HTTP headers, and geo-blocking logic are each assigned rare variants, suggesting deliberate protection tiering by sensitivity.
The contribution is not merely the recovered artifacts, but a transferable analyst methodology: byte patching, pattern-driven parameter recovery, algebraic normalization, emulation-based generalization, and systematic failure analysis. We also define the practical ceiling of this workflow: the remaining 13% of CFF-region functions fail because fragmenting a single large protected function violates Hex-Rays' function-local assumptions about ownership, termination, and cross-function jumps. All IDAPython scripts, the standalone decryptor, and the decoded datasets are released as open source.
Yuki Umemura
Yuki is a Ph.D. student at Kobe University. His research focuses on malware analysis, reverse engineering, and practical deobfuscation methods for modern malware. He is particularly interested in understanding obfuscation techniques and developing analysis workflows and tools to make reverse engineering more effective.
Back to VB2026 conference page