Greetz from academe: counting Jedis


John Aycock

University of Calgary, Canada
Editor: Helen Martin


John Aycock considers Internet censuses and a tool that can scan almost the entire IPv4 address space in search of the answer to a given census question in less than 45 minutes.

Jedi Knights are a force to be reckoned with, and there are data to back that up. Censuses in the UK [1] and Canada [2] as well as other regions of the Empire [3] have tens of thousands of people declaring their religion to be ‘Jedi’. This must seem like a devastating blow to Pastafarians everywhere, of course, but it just goes to show that you never know how many of something you’ll find until you start counting them.

The same principle applies in security: how many machines have an open SSH port? How many Windows XP installations still linger on? How many vulnerable instances of some particular server exist? These are not academic questions; they have very practical relevance. They make excellent bar trivia questions for VB conferences, and they also happen to be precious intelligence for anyone planning a large-scale attack. The answers to these and many other questions can be settled the Jedi way, by taking a census of the Internet.

In the wake of Code Red, an excellent paper (which is still worth a read today) appeared at the 2002 USENIX Security Symposium, entitled ‘How to 0wn the Internet in Your Spare Time’ [4]. Its authors posited that a worm could be built that would infect all vulnerable targets on the Internet in ‘tens of seconds’, so long as a list of these vulnerable targets was compiled in advance – a census, if you will. Speaking of IPv4 at the time, they said ‘it would take roughly two hours to scan the entire address space… Such a brute-force scan would be easily within the resources of a nation state bent on cyberwarfare.’ A thought experiment, but an interesting one.

In 2013, it turns out that any attacker can be a nation state. The latest USENIX Security Symposium has a paper about a tool called ZMap [5]. This is no thought experiment. ZMap can scan almost the entire IPv4 address space in search of the answer to a given census question in less than 45 minutes. And by ‘almost’ I mean 98%, so there’s hardly a need for a qualifier at all. As the authors of the paper point out, a defensive strategy that depends on attackers not finding an IPv4 device on the Internet is rather unwise.
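To put the two time estimates side by side, here is a back-of-the-envelope sketch of the average probe rate each one implies. It assumes exactly one probe per address and the full 2^32 address space with nothing excluded; both are simplifications of mine, and the function name is purely illustrative.

```python
# Rough arithmetic only: one probe per address, full 2**32 space assumed.
IPV4_ADDRESSES = 2 ** 32


def probes_per_second(addresses: int, minutes: float) -> float:
    """Average send rate needed to probe `addresses` hosts in `minutes`."""
    return addresses / (minutes * 60)


if __name__ == "__main__":
    estimates = (("two hours (2002 estimate)", 120), ("45 minutes (ZMap)", 45))
    for label, minutes in estimates:
        rate = probes_per_second(IPV4_ADDRESSES, minutes)
        print(f"{label}: about {rate:,.0f} probes per second")
```

Under these assumptions the 2002 estimate works out to roughly 0.6 million probes per second, and the 45-minute figure to roughly 1.6 million.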

The idea of scanning the whole Internet for vulnerabilities, and the ability to do so, may seem like old hat. Perhaps the most (in)famous recent example was the ‘Internet Census 2012’ performed by the anonymous author of the Carna botnet [6], which commandeered vulnerable devices to scan and collect data. Yet the bar is still set relatively high, because not everyone would be able to build such an infrastructure.

That sound you hear is the bar dropping. ZMap is open source and publicly available. It runs in user space on Linux and achieves high scan rates (a 1,300-fold improvement over Nmap speeds, according to the authors’ data) from a single, not very impressive machine with a gigabit Ethernet connection. Scanning the IPv4 space is now well within reach of script kiddies.
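How far can a single gigabit link be pushed? The sketch below is my own arithmetic rather than anything taken from the ZMap paper. It assumes every probe fits in a minimum-size Ethernet frame (a bare TCP SYN does) and uses the standard framing overheads; the constant and function names are illustrative.

```python
# Assumed figures (standard Ethernet framing, not taken from the article).
LINK_BITS_PER_SECOND = 1_000_000_000  # gigabit Ethernet
MIN_FRAME_BYTES = 64                  # minimum frame, including the checksum
PREAMBLE_BYTES = 8                    # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12             # mandatory idle time between frames


def max_frames_per_second(link_bps: int = LINK_BITS_PER_SECOND) -> float:
    """Upper bound on minimum-size frames per second for a given link speed."""
    wire_bytes = MIN_FRAME_BYTES + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES
    return link_bps / (wire_bytes * 8)


def minutes_to_scan(addresses: int, frames_per_second: float) -> float:
    """Minutes needed to send one frame to each of `addresses` hosts."""
    return addresses / frames_per_second / 60


if __name__ == "__main__":
    rate = max_frames_per_second()
    print(f"line rate: about {rate:,.0f} frames per second")
    print(f"full 2**32 sweep: about {minutes_to_scan(2 ** 32, rate):.1f} minutes")
```

On these assumptions the link tops out at a little under 1.5 million frames per second, and a sweep of all 2^32 addresses would take about 48 minutes, which is in the same ballpark as the scan time quoted above.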

There is some impressive engineering behind ZMap’s implementation. Probes are sent via raw sockets, bypassing the overhead of the TCP/IP stack by crafting Ethernet frames directly and reusing parts of the packets where possible. No per-address state is maintained; instead, what amounts to a pseudo-random sequence of IPv4 addresses is used to keep track of what has been scanned and what has yet to be scanned. (This is actually the permutation scanning idea from [4], but the ZMap paper doesn’t cite it.) No probe retransmission occurs, but the potential data loss from this optimization was measured and found to be negligible.
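How can a scanner visit every address once, in scrambled order, without keeping a list of where it has been? One well-known trick, sketched here as a toy and not as ZMap’s actual code, is to walk a multiplicative group modulo a prime slightly larger than the address space, so that the only state is the current group element. The example uses a pretend 8-bit ‘address space’ with the prime 257; the function names, the generator 3 and the starting point are all my own illustrative choices. For real IPv4 the ZMap paper reports using a prime just above 2^32.

```python
from typing import Iterator


def prime_factors(n: int) -> set:
    """Return the distinct prime factors of n by trial division."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def is_generator(g: int, p: int) -> bool:
    """True if g generates the multiplicative group modulo the prime p."""
    if g % p == 0:
        return False
    return all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))


def scan_order(address_count: int, p: int, g: int, start: int = 1) -> Iterator[int]:
    """Yield every address in [0, address_count) exactly once, scrambled.

    p must be a prime with p - 1 >= address_count, and g a generator mod p.
    The only state carried between probes is the current element x.
    """
    if address_count > p - 1 or not 1 <= start < p:
        raise ValueError("need address_count <= p - 1 and 1 <= start < p")
    if not is_generator(g, p):
        raise ValueError("g does not generate the whole group")
    x = start
    for _ in range(p - 1):
        if x <= address_count:  # group elements 1..p-1 map to addresses 0..p-2
            yield x - 1
        x = (x * g) % p


if __name__ == "__main__":
    order = list(scan_order(address_count=256, p=257, g=3))
    print(order[:8])                          # first few 'addresses' visited
    print(sorted(order) == list(range(256)))  # every address appears once
```

Because the generator cycles through all p - 1 group elements before repeating, coverage is complete by construction, and a different starting element or generator gives a different-looking order for each scan.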

Dabbling in dual-use technology is unavoidable in some areas of academic research, and one might even say ‘it’s a trap!’ The ZMap authors are clearly aware of the potential their tool has for misuse, and explicitly note that in the paper. Out of curiosity, I searched the paper for ‘ethic’ and got one hit: ‘We worked with senior colleagues and our local network administrators to consider the ethical implications of high-speed Internet-wide scanning and to develop a series of guidelines to identify and reduce any risks’. Happily, the reader is not burdened with any details of the ethical argumentation, nor with the ethics of doing the work in the first place or of releasing ZMap to the public. (A similar search for ‘legal’ turns up advice that scanners should ‘comply with any special legal requirements in their jurisdictions’, along with a mention of the legal threats received from disgruntled scannees.)

It may not beat making the Kessel Run in under 12 parsecs, but ZMap can indeed find the droids you’re looking for.


[4] Staniford, S.; Paxson, V.; Weaver, N. How to 0wn the Internet in Your Spare Time. Proceedings of the 11th USENIX Security Symposium, 2002.

[5] Durumeric, Z.; Wustrow, E.; Halderman, J. A. ZMap: Fast Internet-wide Scanning and Its Security Applications. Proceedings of the 22nd USENIX Security Symposium, 2013.


