Greetz from academe: Monkey vs. Python

2013-11-04

John Aycock

University of Calgary, Canada
Editor: Helen Martin

Abstract

Python obfuscation is relatively rare. In the latest of his ‘Greetz from academe’ series, highlighting some of the work going on in academic circles, John Aycock takes a look at a research paper in which the authors reverse engineered a ‘hardened’ Python application from Dropbox.


Some programming languages have an embarrassment of riches when it comes to code obfuscation. For JavaScript, of course, every few months sees a fresh analysis of malicious code, such as Peter Ferrie’s recent breakdown of JS/Proslikefan [1]. For C, code obfuscation is sport, with the International Obfuscated C Code Contest [2]. And Perl is… Perl is another programming language.

Python, however, has largely eluded the obfuscation craze. There are a few examples, including a beautiful Mandelbrot set generator whose code is shaped like a Mandelbrot set [3]; another post by the same author [4] contains links to some other scattered Python obfuscation examples, and there was a 2011 PyCon talk on the subject [5]. In the unlikely event that the bad guys ever decided to forsake JavaScript for Python, these few examples could turn out to be Useful Information.

All this means that when I see anything relating to Python obfuscation, it quickly gets my attention. That was the case with a paper from the 2013 USENIX Workshop on Offensive Technologies, called ‘Looking inside the (Drop) box’ [6], in which the authors detail their techniques for reverse engineering a ‘hardened’ Python application from Dropbox. It’s a paper that wouldn’t be out of place in the pages of VB and, much to my surprise, it turns out that I (very indirectly) helped with the work.

In the Dropbox case, the Python obfuscation was not at source level, but in the ‘frozen’ version that was shipped out. A frozen Python application is one where all the pieces of compiled Python bytecode are bundled together to allow a single file to be distributed. It’s essentially a form of (non-malicious) packing, and a number of legitimate tools/scripts exist for this purpose – one even in the Python source distribution itself.
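For readers who have not looked inside a frozen application, the following is a minimal sketch of the underlying idea, not of any particular freezing tool: source is compiled to a code object, serialized to bytes that could be embedded in a single file, and later loaded and executed without the source. The module text, the `<bundled>` filename and the `greet` function are made up for illustration.

```python
import marshal

# Hypothetical module source; a real freezer would read this from .py files.
source = "def greet(name):\n    return 'hello ' + name\n"

# Compile to a code object, then serialize it to plain bytes.
code_obj = compile(source, "<bundled>", "exec")
blob = marshal.dumps(code_obj)  # bytes that could be embedded in one file

# Later, at start-up: deserialize and run the bytecode, no source needed.
namespace = {}
exec(marshal.loads(blob), namespace)
result = namespace["greet"]("world")
print(result)
```

The serialized form is specific to the interpreter version that produced it, which is one reason a frozen application ships its own interpreter.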

Dropbox’s frozen executable was modified to make reversing it more challenging, though [6]. The opcode values were altered, the code was encrypted, and the normal means to query bytecode were removed, amongst other things. The researchers ended up injecting a DLL into the Dropbox process to gain control, allowing them eventually to inject their own Python code into the hardened interpreter. A few steps later (all of which are detailed in the paper), they had acquired the Python bytecode.
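To give a feel for what altering opcode values means, here is a toy sketch. It is not the scheme Dropbox used (the paper has the real details); the arithmetic permutation and the `sample` function are invented for illustration, and the code assumes the two-bytes-per-instruction bytecode layout of CPython 3.6 and later.

```python
# Toy opcode permutation: every opcode byte value is mapped to a different one.
# 7 is coprime with 256, so this is a bijection on 0..255.
FORWARD = [(op * 7 + 3) % 256 for op in range(256)]

# Invert the table so that the original values can be recovered.
INVERSE = [0] * 256
for real, fake in enumerate(FORWARD):
    INVERSE[fake] = real


def remap(code_bytes, table):
    """Rewrite the byte at every even offset (the opcode slot) using table."""
    out = bytearray(code_bytes)
    for i in range(0, len(out), 2):  # odd offsets hold arguments; left alone
        out[i] = table[out[i]]
    return bytes(out)


def sample(a, b):
    return a + b


original = sample.__code__.co_code
scrambled = remap(original, FORWARD)   # meaningless to a stock interpreter
restored = remap(scrambled, INVERSE)   # what a reverser needs to rebuild
print(restored == original)
```

A stock disassembler or decompiler fed the scrambled bytes would see the wrong instructions, so the reverser's job is to recover the equivalent of the `INVERSE` table.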

Once the bytecode had been extracted, the authors used a tool called uncompyle2 to reconstitute the Python source code. Upon further examination [7], I discovered that the tool is based on a Python compilation framework I created, and a Python decompiler that I cobbled together in around 1999. It’s a small world, and it’s reassuring, as an academic, to know that occasionally something useful comes of my work.

Back to the reverse engineering: after the Python source code had been extracted, the researchers worked around Dropbox’s authentication and gathered up SSL data, using a technique they called ‘monkey patching’.

I must confess that I had never heard that term before, and it brought to mind either a roomful of monkeys with typewriters working on Shakespeare v2.0, or animals prone to flinging their own faeces. In neither case did it cast the reverse engineering in a terribly flattering light. Naturally, I turned to the arbiter of all that is true, Wikipedia, which helpfully informed me [8] – and I am not making this up – that the technique ‘has also been termed duck punching and shaking the bag’. The emphasis is theirs, believe me. So while monkey patching sounds bad, the alternatives are even more ghastly. But I digress.

Apparently, monkey patching is simply poking into a dynamic language at run time and modifying things. This allowed the paper’s authors to hook all the SSL objects in the Python code and dump out their data unencrypted. And thus Dropbox fell.
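A minimal sketch of the idea, under stated assumptions: `FakeSecureSocket` is a hypothetical stand-in for an SSL-wrapped socket (its "encryption" just reverses the bytes), and `install_hook` is an illustrative helper, not code from the paper or from Dropbox. The point is only that a method can be swapped at run time for a wrapper that sees the plaintext first.

```python
class FakeSecureSocket:
    """Hypothetical stand-in for an SSL socket; not a real library class."""

    def __init__(self):
        self.wire = []  # what would leave the machine "encrypted"

    def write(self, data):
        # Pretend encryption: reverse the bytes before "sending" them.
        self.wire.append(data[::-1])
        return len(data)


captured = []  # plaintext recorded by the hook


def install_hook(cls):
    """Replace cls.write at run time with a wrapper that logs plaintext."""
    original_write = cls.write

    def logging_write(self, data):
        captured.append(bytes(data))        # record the data before encryption
        return original_write(self, data)   # then behave exactly as before

    cls.write = logging_write
    return original_write


install_hook(FakeSecureSocket)

sock = FakeSecureSocket()
sock.write(b"secret")
print(captured, sock.wire)
```

Because the patch is applied to the class rather than to one object, every instance picks up the hook, including instances created by code the person patching never touched.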

No monkeys, pythons, or ducks were harmed in the creation of this article.

Bibliography

[1] Ferrie, P. Fans like pro, too. Virus Bulletin, September 2013. http://www.virusbtn.com/vba/2013/09/vb201309-Proslikefan.

[2] The International Obfuscated C Code Contest. http://www.ioccc.org/.

[3] Preshing, J. High-resolution Mandelbrot in obfuscated Python. http://preshing.com/20110926/high-resolution-mandelbrot-in-obfuscated-python/.

[4] Preshing, J. Penrose tiling in obfuscated Python. http://preshing.com/20110822/penrose-tiling-in-obfuscated-python/.

[6] Kholia, D.; Wegrzyn, P. Looking inside the (Drop) box. 7th USENIX Workshop on Offensive Technologies, 2013.

[7] uncompyle2/README. Uncompyle2 0.13, 22 February 2012. Source available on GitHub.
