SPUTR: a proposal for the uniform naming of spammer and phisher content tricks

2006-08-01

John Graham-Cumming

Independent consultant, France
Editor: Helen Martin

Abstract

John Graham-Cumming thinks it's time for an information-rich naming scheme that can be used to refer to spammer and phisher content tricks.


Introduction

I have been tracking the tricks used by spammers in the bodies of their messages since January 2003. Three years on, I have collected 55 distinct tricks and published them on The Spammers' Compendium website [1]. When I first started publishing the site I gave each of the tricks a humorous name (such as 'Camouflage' or 'Honey, I shrunk the font'), and some of these names have entered popular use (such as 'Hypertextus Interruptus', which is enshrined in the SpamAssassin test INTERRUPTUS).

Tricks in the wild

The trick count has been growing steadily over the last three years: Figure 1 shows the number of tricks in The Spammers' Compendium by calendar quarter. It is interesting to note that trick innovation or discovery seems to slow down in the fourth quarter of each year, perhaps indicating that spammers are busy sending their Christmas campaigns at that time and are not spending time on modifying their software.


Figure 1. Trick count by calendar quarter.

Entries are made in The Spammers' Compendium when the tricks have been identified by me in spam seen in the wild in my spam traps, or in spam emailed to me by volunteers. Submitters receive credit in The Spammers' Compendium for submitting a new trick.

While the humorous names make good copy for journalists writing about the latest devious spammer trickery, they are less useful to people working in anti-spam research because they do not, in themselves, convey much information. In this article (and the related blog post [2]) I propose a drier, but more information-rich, naming scheme that can be used to refer to spammer and phisher content tricks.

Time for a naming scheme

At the 2004 Virus Bulletin conference I presented a paper (see [3]) in which I analysed some trends in the use of spammers’ tricks by examining the appearance of various tricks (as extracted from The Spammers' Compendium) against a large corpus of spam supplied by Sophos. One of the problems in that analysis was that I was forced to write code to identify the tricks in The Spammers' Compendium and I also had to explain each trick as the names conveyed little information.

To remedy that situation, and to provide a foundation on which other authors and vendors can build research into spammer trickery, I think it's time for a uniform naming scheme for these tricks.

In the uniform naming scheme, which I am calling the Spam/Phish Uniform Trick Repository, or SPUTR, each name consists of three '!'-separated parts: a purpose, a name, and a technology. The purpose is the reason for the trick (for example, the trick is used to obscure a URL, or to insert innocent words). The name is derived from the current Spammers' Compendium pejorative name. The technology identifies the way in which the trick is coded (for example, with HTML or MIME).

Table 1 contains a list of proposed 'purposes' that can be used to categorize tricks.

Code  Purpose                 Description
BWO   Bad word obfuscation    Making it hard for a filter to parse potentially bad words (e.g. Viagra).
GWI   Good word insertion     Adding words likely to confuse a statistical filter.
HB    Hash busting            Inserting randomness designed to make message hashing hard.
TA    Tokenization avoidance  Preventing a filter from tokenizing a message.
UH    URL hiding              Hiding a URL so that a user is fooled into clicking an incorrect link.
UO    URL obfuscation         Making it hard for a filter to identify a URL and check it against a black list.
WB    Web bugs                Inserting a beacon that tells the spammer that a message has been read.

Table 1. Trick purposes

For a single name there could be multiple tricks using different technologies (e.g. some tricks might be implemented using HTML or CSS), or tricks intended for different purposes (words might be inserted to fool a Bayesian filter or break a hash).

Table 2 shows the 'technologies' that would be recognized in the naming scheme:

Identifier  Meaning
CSS         Use of CSS
HTML        Any HTML without using CSS
Javascript  Use of Javascript for trickery
MIME        Manipulation of MIME
Plain       Plain text

Table 2. Technology identifiers.

For example, the original Invisible Ink trick, written using HTML, would be referred to as:

GWI!Invisible!HTML

while a CSS variant would be:

GWI!Invisible!CSS

Names would be generated only for tricks that have been seen in the wild.
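The three-part structure can be checked mechanically. The following is a minimal sketch, not part of the proposal itself: the class `SputrName` and the function `parse_sputr` are hypothetical names chosen for illustration, and the good-word purpose code is taken to be GWI, the form used in the example names above.

```python
from dataclasses import dataclass

# Purpose codes (Table 1) and technology identifiers (Table 2).
# "GWI" is assumed here because the example names use that form.
PURPOSES = {
    "BWO": "Bad word obfuscation",
    "GWI": "Good word insertion",
    "HB": "Hash busting",
    "TA": "Tokenization avoidance",
    "UH": "URL hiding",
    "UO": "URL obfuscation",
    "WB": "Web bugs",
}
TECHNOLOGIES = {"CSS", "HTML", "Javascript", "MIME", "Plain"}


@dataclass(frozen=True)
class SputrName:
    """Hypothetical container for one purpose!name!technology triple."""
    purpose: str
    name: str
    technology: str

    def __str__(self) -> str:
        return f"{self.purpose}!{self.name}!{self.technology}"


def parse_sputr(text: str) -> SputrName:
    """Split a name on '!' and validate the purpose and technology parts."""
    parts = text.split("!")
    if len(parts) != 3:
        raise ValueError(f"expected three '!'-separated parts: {text!r}")
    purpose, name, technology = parts
    if purpose not in PURPOSES:
        raise ValueError(f"unknown purpose: {purpose!r}")
    if not name:
        raise ValueError("empty trick name")
    if technology not in TECHNOLOGIES:
        raise ValueError(f"unknown technology: {technology!r}")
    return SputrName(purpose, name, technology)
```

For instance, `parse_sputr("GWI!Invisible!CSS")` yields a value whose `purpose` is `GWI` and whose `technology` is `CSS`, while a string with an unrecognized purpose code raises `ValueError`.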

With such uniform naming it would be possible to analyse spams and phishes (perhaps even with a specific recognizer written for each trick) and to build up trends over time, showing how individual tricks and individual classes of tricks are changing.
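As an illustration of the kind of trend analysis that uniform names would allow, the sketch below counts sightings per calendar quarter and per purpose. It assumes the observations are available as (date, SPUTR name) pairs; the function names and the sample data are invented for the example and are not measurements.

```python
from collections import Counter
from datetime import date


def quarter(d: date) -> str:
    """Label a date with its calendar quarter, e.g. '2006-Q1'."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def trend_by_purpose(observations):
    """Count sightings keyed by (quarter, purpose code)."""
    counts = Counter()
    for seen_on, sputr_name in observations:
        purpose = sputr_name.split("!")[0]  # first '!'-separated part
        counts[(quarter(seen_on), purpose)] += 1
    return counts


# Made-up sightings, for illustration only.
observations = [
    (date(2006, 1, 15), "GWI!Invisible!HTML"),
    (date(2006, 2, 3), "GWI!Invisible!CSS"),
    (date(2006, 2, 20), "BWO!Space!Plain"),
    (date(2006, 5, 9), "BWO!Space!Plain"),
]

for (q, purpose), n in sorted(trend_by_purpose(observations).items()):
    print(q, purpose, n)
```

Because the purpose and technology are part of the name, the same data can be regrouped by technology, or by individual trick, simply by selecting a different part of the split.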

Table 3 shows the proposed mapping from the current Spammers' Compendium names to the SPUTR name.

Spammers' Compendium name       SPUTR name
The Big Picture                 TA!BigPicture!HTML
Invisible Ink                   GWI!Invisible!HTML and GWI!Invisible!CSS
The Daily News                  GWI!BigTag!HTML
Hypertextus Interruptus         BWO!Interruptus!HTML
Slice and Dice                  TA!SliceNDice!HTML
MIME is money                   GWI!PlainNotHTML!MIME
Lost in Space                   BWO!Space!Plain
Enigma                          UO!Enigma!HTML
Script writer                   TA!Script!Javascript
Ze Foreign Accent               BWO!Accent!Plain
Speaking in Tongues             HB!Tongues!Plain
The Black Hole                  BWO!BlackHole!HTML
A Numbers Game                  BWO!Numbers!HTML
Bogus Login                     UO!BogusLogin!HTML
Honey, I Shrunk the Font        GWI!ShrunkFont!HTML
No Whitespace, No Cry           TA!NoWhitespace!Plain
Honorary Title                  GWI!Title!HTML
Camouflage                      GWI!Camouflage!HTML
And in the Right Corner         HB!RightCorner!Plain
A Form of Desperation           GWI!Form!HTML and BWO!Form!HTML
It's Mini Marquee!              GWI!Marquee!HTML
You've Been Framed              BWO!Framed!HTML
Control Freak                   TA!ControlFreak!Plain
Don't Cramp My Style            GWI!Style!CSS
The Microdot                    BWO!Microdot!CSS
WYSI_not_WYG                    UH!WYSINotWYG!Javascript
Ultra                           See Enigma
Internet Exploiter              UH!InternetExploiter!HTML
Style Wars: Episode 1           Included in other tricks
The tURLing Test                UO!TurlingTest!Plain
Flex Hex                        BWO!FlexHex!CSS
Sound of Silence                WB!Silence!HTML
Blankety Blank                  BWO!BlanketyBlank!HTML
Doing the Splits                BWO!Splits!Plain
But is it Art?                  BWO!ASCIIArt!Plain
Absolute Zero                   Same as Control Freak
Spell Breaker                   BWO!Splelnig!Plain
About Face                      BWO!AboutFace!HTML
Catch a Wave                    TA!Wave!HTML
Treasure Map                    UH!TreasureMap!HTML
You Cannot be Serious           UO!Mcenroe!HTML
The Matrix                      TA!Matrix!Plain
Sticky Fingers                  BWO!StickyFingers!Plain
Flotation Device                TA!Floatation!CSS
The Small Picture               TA!SmallPicture!HTML
ChopGUI                         TA!ChopGUI!HTML and HB!ChopGUI!HTML
Big Header-ed                   ?
The Rake                        BWO!TheRake!CSS
Now you see it; now you don't   BWO!Copperfield!CSS
Slick Click Trick               UH!Caption!HTML
Whiter Shade of Pale            TA!Pale!HTML

Table 3. Trick name mapping.

Cooperation

If the anti-spam and anti-phish community gets together now, it may be able to avoid the mess that exists in the anti-virus industry, where vendors compete to release information about viruses and each has its own way of naming them.

Worse, the current unifying malware scheme maintained by MITRE (the Common Malware Enumeration or CME; see http://cme.mitre.org/) unifies virus names by providing a simple identifier for each that contains absolutely no information. For example, the Kukudro.C worm is currently assigned the uninformative name 'CME136'.

In order to help the anti-spam and anti-phish community I propose to:

  1. Maintain a website containing the uniform naming scheme and keep it updated as new spammer tricks are reported to me;

  2. Allow any organization to use the names freely and identify themselves as a user by including their name or logo on an appropriate page on the site without any form of compensation;

  3. Accept reports of new spammer and phisher trickery for inclusion on the website;

  4. Host a mailing list for all interested parties so that tricks can be discussed and named;

  5. Manage an open source project that creates software that can analyse an RFC822 message and output the tricks used.

In order to do that I would like the support of at least five major email security companies in the form of a decision to use the SPUTR names in their own research and publications.
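The analysis software described in item 5 might be organized as a table of recognizers keyed by SPUTR name. The sketch below uses Python's standard `email` package to parse a message; the single recognizer shown is a toy that assumes BWO!Space!Plain means a word whose letters are separated by spaces, which is a guess from the trick's name rather than its definition in The Spammers' Compendium. The names `RECOGNIZERS`, `spaced_letters` and `tricks_in` are hypothetical.

```python
import re
from email import message_from_string


def spaced_letters(text: str) -> bool:
    """Toy check: five or more single letters separated by single spaces."""
    pattern = r"(?<!\w)(?:[A-Za-z] ){4,}[A-Za-z](?!\w)"
    return re.search(pattern, text) is not None


# Hypothetical registry: SPUTR name -> recognizer for text/plain bodies.
RECOGNIZERS = {
    "BWO!Space!Plain": spaced_letters,
}


def tricks_in(raw_message: str) -> list:
    """Return the sorted SPUTR names whose recognizers fire on the message."""
    msg = message_from_string(raw_message)
    found = set()
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue  # this sketch only registers Plain recognizers
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "ascii"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:  # unknown charset label in the message
            text = payload.decode("ascii", errors="replace")
        for name, recognizer in RECOGNIZERS.items():
            if recognizer(text):
                found.add(name)
    return sorted(found)
```

A real tool would need recognizers for the HTML, CSS, Javascript and MIME technologies as well, each applied to the appropriate part of the message; the registry layout simply makes it easy to add one recognizer per name.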

Undoubtedly there will be many things about this proposal that old anti-virus hands and those fighting email security problems would like to modify or comment on; please send your comments to

Bibliography

[1] The Spammers' Compendium. http://www.jgc.org/tsc/.

[2] Graham-Cumming J. Proposed uniform naming scheme for spammer/phisher content trickery. http://www.jgc.org/blog/2006/06/proposed-uniform-naming-scheme-for.html.

[3] Graham-Cumming J. The Waxing and Waning of Spammers' Trickery. Proceedings of the Virus Bulletin International Conference, 2004. http://www.virusbtn.com/conference/vb2004/abstracts/jgrahamcumming.xml.
