Excel Formula/Macro in .xlsb?

Kurt Natvig

Forcepoint Innovation Labs, UK


 

Excel Formula, or XLM – does it ever stop giving pain to researchers?

Last week I received a new sample using the xlsb file format that supposedly contained malicious code. I had a quick look, and wow – this was different. An initial check on VirusTotal (VT) showed that it hadn’t been uploaded to VT yet. So, with nothing to go on, I started looking into the sample.

Structurally, it’s a Microsoft Excel 2007+ document (ZIP) containing the following files:

Excel-Formula-fig1.png

Naturally, we look at the xl/macrosheets/sheet1.bin, right? First we need to enumerate the records it contains. The xl/macrosheets/sheet1.bin looks like this:

Excel-Formula-fig2.png 

How are the records stored? The answer is in Microsoft’s documentation. To establish the recordId, you read the first byte (0x81). Since the high bit (0x80) is set, another byte contributes to the recordId. Clear that bit for now and we get 0x01. The next byte is 0x01 and, as its high bit (0x80) isn’t set, it is the last byte of the recordId; its value is multiplied by 0x80. This means that the recordId is (1*128)+1 = 129, which is BrtBeginSheet. To get the length you do the same: read the next byte (0x00). Its high bit (0x80) isn’t set, so no further byte follows. The remaining seven bits are 0, so the record has no data.

The next record is BrtWsProp with recordId 147 and length 23.

recordId: (0x93 & 0x7F) + (0x01*0x80) = 147 (0x93)
length: (0x17 & 0x7F) = 23 (0x17)
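
The decoding described above can be sketched in a few lines of Python. This is a minimal illustration, not a full parser: the function names `read_varint` and `iter_records` are made up for this example, and the limits of two bytes for the recordId and four bytes for the length are assumptions taken from my reading of Microsoft's documentation rather than from anything shown in this article.

```python
def read_varint(buf, pos, max_bytes):
    """Read a little-endian value stored 7 bits per byte.

    A set high bit (0x80) means another byte follows.
    Returns (value, new_position).
    """
    value = 0
    for i in range(max_bytes):
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            break
    return value, pos


def iter_records(buf):
    """Yield (offset, record_id, length, payload) for each record in a .bin part."""
    pos = 0
    while pos < len(buf):
        start = pos
        record_id, pos = read_varint(buf, pos, 2)  # assumed: at most 2 bytes
        length, pos = read_varint(buf, pos, 4)     # assumed: at most 4 bytes
        yield start, record_id, length, buf[pos:pos + length]
        pos += length


# The two record headers discussed above, with a dummy 23-byte payload.
sample = bytes([0x81, 0x01, 0x00, 0x93, 0x01, 0x17]) + bytes(23)
for offset, record_id, length, _payload in iter_records(sample):
    print(f"offset {offset:x}: recordId {record_id}, length {length}")
```

Run against the two headers from the text, this reports recordId 129 with length 0, followed by recordId 147 with length 23.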

Now you can parse all the records and get a nice list. Unfortunately, when parsing the records of the xl/macrosheets/sheet1.bin I see nothing unusual.

So we move on to the other sheets. What can we find here? Quite a lot, actually. The records we are interested in, while we learn, are:

RecordId  Name           Description
0         BrtRowHdr      Tells you what row you currently are on
8         BrtFmlaString  Tells you about an embedded string and the pcode (parsed expression) to build this string
11        BrtFmlaError   Tells you the pcode (parsed expression)

 

Let’s have a brief look at the data we need.

 

BrtFmlaString

Microsoft has documented this well in a PDF. The record starts with an eight-byte cell information structure, followed by a variable-length XLWideString (which looks like a Unicode string) and two bytes of grbitFlags, and then you get to the formula itself (the CellParsedFormula structure).
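
As a rough sketch of that layout in Python: the function name `parse_brt_fmla_string` is hypothetical, and the field details are assumptions based on my reading of Microsoft's documentation (the column is the first DWORD of the cell structure, the XLWideString is a 4-byte character count followed by UTF-16LE text, and CellParsedFormula starts with a 4-byte size of the token stream). The sample bytes are rebuilt from the decoded values shown in this article, with unknown fields zeroed.

```python
import struct


def parse_brt_fmla_string(payload):
    """Split a BrtFmlaString payload into (column, cached string, formula tokens)."""
    (col,) = struct.unpack_from("<I", payload, 0)  # first DWORD of the 8-byte cell structure
    pos = 8                                        # skip the rest of the cell structure
    (cch,) = struct.unpack_from("<I", payload, pos)  # XLWideString: character count
    pos += 4
    text = payload[pos:pos + 2 * cch].decode("utf-16-le")
    pos += 2 * cch
    pos += 2                                       # grbitFlags, ignored in this sketch
    (cce,) = struct.unpack_from("<I", payload, pos)  # CellParsedFormula: token stream size
    pos += 4
    rgce = payload[pos:pos + cce]
    return col, text, rgce


# A 30-byte payload rebuilt from the decoded values in this article.
sample = (
    struct.pack("<I", 26) + bytes(4)        # cell structure: column 26, rest zeroed
    + struct.pack("<I", 1) + "/".encode("utf-16-le")
    + bytes(2)                              # grbitFlags (zeroed placeholder)
    + struct.pack("<I", 6) + bytes.fromhex("1E2F00416F00")
    + struct.pack("<I", 0)                  # size of the trailing extra data
)
print(len(sample), parse_brt_fmla_string(sample))
```

The field sizes add up to the record length of 30 reported for the first BrtFmlaString record, which is a useful sanity check on the layout.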

The first one you’ll find is this:

Excel-Formula-fig3.png 

After decoding, we get this:

RECORD: BrtFmlaString (Id 8,offset 58d), LENGTH: 30
col: 26, row: 20 | strlen=1 : "/"
         1E 2F 00            PtgInt: 47
         41 6F 00            PtgFunc: CHAR (111)

The record has no information about the row, so you need to get this from the BrtRowHdr record. When you get to the CellParsedFormula structure you parse it (as mentioned in my previous article).
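
To show how the two tokens in this record produce the cached string, here is a deliberately tiny stack evaluator in Python. It is a sketch that handles only the two tokens from the dump above (PtgInt, and PtgFunc calling CHAR), with the token sizes read off that dump; the name `eval_rgce` is made up, and a real parser needs the full token table.

```python
import struct

PTG_INT = 0x1E    # followed by a 2-byte unsigned integer
PTG_FUNC = 0x41   # followed by a 2-byte function index
FUNC_CHAR = 111   # CHAR, as shown in the dump


def eval_rgce(rgce):
    """Evaluate a token stream consisting only of PtgInt and PtgFunc(CHAR)."""
    stack = []
    pos = 0
    while pos < len(rgce):
        ptg = rgce[pos]
        (operand,) = struct.unpack_from("<H", rgce, pos + 1)
        if ptg == PTG_INT:
            stack.append(operand)
        elif ptg == PTG_FUNC and operand == FUNC_CHAR:
            stack.append(chr(stack.pop()))  # CHAR(n) -> character with code n
        else:
            raise NotImplementedError(f"token {ptg:#04x} not handled in this sketch")
        pos += 3  # both handled tokens are 3 bytes long
    return stack[-1]


print(eval_rgce(bytes.fromhex("1E2F00416F00")))  # CHAR(47)
```

For the record above, CHAR(47) evaluates to "/", which matches the one-character string stored in the record.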

 

BrtFmlaError

This record also starts with an eight-byte cell structure, then a one-byte fErr and a two-byte grbitFlags, before you reach the formula itself (the CellParsedFormula structure).

Excel-Formula-fig4.png 

When you parse the first record of this stream you’ll get:

RECORD: BrtFmlaError (Id 11,offset 1e3), LENGTH: 62
     49 27 00            PtgMemFunc: 27
     19 40 00 01         PtgAttrSpace: 0100
     23 04 00 00 00      PtgName: index 4
     23 14 00 00 00      PtgName: index 20
     0F                  PtgIsect:
     23 5D 00 00 00      PtgName: index 93
     0F                  PtgIsect:
     23 46 00 00 00      PtgName: index 70
     0F                  PtgIsect:
     23 15 00 00 00      PtgName: index 21
     0F                  PtgIsect:
     23 2F 00 00 00      PtgName: index 47
     0F                  PtgIsect:
     13                  PtgUminus:
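
A small Python sketch can pull the PtgName indexes out of such a record. The function names are hypothetical, the token sizes are simply read off the dump above, and only the five tokens that appear there are handled. The 4-byte size field in front of the token stream is an assumption based on my reading of Microsoft's documentation.

```python
import struct

# Token sizes in bytes (opcode included), as they appear in the dump above.
TOKEN_SIZES = {
    0x49: 3,  # PtgMemFunc: opcode + 2-byte size of the sub-expression
    0x19: 4,  # PtgAttrSpace as dumped: 19 40 + 2 bytes
    0x23: 5,  # PtgName: opcode + 4-byte name index
    0x0F: 1,  # PtgIsect
    0x13: 1,  # PtgUminus
}


def split_brt_fmla_error(payload):
    """Return (fErr, token stream) from a BrtFmlaError payload."""
    f_err = payload[8]                              # follows the 8-byte cell structure
    (cce,) = struct.unpack_from("<I", payload, 11)  # after the 2-byte grbitFlags
    return f_err, payload[15:15 + cce]


def name_indexes(rgce):
    """Collect the index of every PtgName token, in order of appearance."""
    indexes = []
    pos = 0
    while pos < len(rgce):
        ptg = rgce[pos]
        if ptg not in TOKEN_SIZES:
            raise NotImplementedError(f"token {ptg:#04x} not handled in this sketch")
        if ptg == 0x23:
            indexes.append(struct.unpack_from("<I", rgce, pos + 1)[0])
        pos += TOKEN_SIZES[ptg]
    return indexes
```

Walking into the PtgMemFunc sub-expression works here because the walker only skips the three-byte header and then continues with the tokens that follow it. The sizes also add up: 8 + 1 + 2 + 4 bytes of header, 43 bytes of tokens, and a 4-byte trailing size field give the record length of 62 shown above.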

 

BrtRowHdr

This is a simple structure, but for now we just want the row.

Excel-Formula-fig5.png

The first DWORD gives you the row you need (in this case, 2).
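
Since the formula records carry only a column, the row has to be remembered while walking the records. A minimal Python sketch of that bookkeeping follows; the function name `cells_with_rows` is made up, the input is assumed to be (recordId, payload) pairs from whatever record enumerator you use, and the column is assumed to be the first DWORD of the cell structure.

```python
import struct

BRT_ROW_HDR = 0
BRT_FMLA_STRING = 8
BRT_FMLA_ERROR = 11


def cells_with_rows(records):
    """Attach the current row to each formula record.

    records: iterable of (record_id, payload) pairs in stream order.
    Yields (row, column, record_id) for BrtFmlaString and BrtFmlaError.
    """
    row = None  # unknown until the first BrtRowHdr is seen
    for record_id, payload in records:
        if record_id == BRT_ROW_HDR:
            (row,) = struct.unpack_from("<I", payload, 0)  # first DWORD is the row
        elif record_id in (BRT_FMLA_STRING, BRT_FMLA_ERROR):
            (col,) = struct.unpack_from("<I", payload, 0)  # assumed: column comes first
            yield row, col, record_id
```

Collecting the yielded (row, column) pairs together with the decoded formulas gives the coordinates needed to lay the cells out as a sheet.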

 

The result

When you have parsed all these records from all these binary worksheets, you’ll end up with a virtual sheet that looks like this:

Excel-Formula-fig6.png 

This is more informative, but it was a bit of work to get there. At least it is context you can relate to.

As I am writing this article I see that VT has received a copy of the sample, and that when it was first checked (on entry) a single engine was detecting it:

Excel-Formula-fig7.png 

Kudos to Ikarus!

I think my little project is over – when I have a problem like this I can’t let it go until it’s solved, but now I can finally relax!

Get in touch with me if you need help! I think tools should give this kind of context automagically.
