Virus Bulletin :: VBSpam methodology

This methodology version has been superseded by a newer version.

Overview

Test purpose

VBSpam is a certification and comparative test for email security products, sometimes referred to as spam filters. A product that earns the VBSpam certification can be considered to meet a minimum standard of quality when it comes to the blocking of unwanted or malicious emails, while a product that earns the VBSpam+ certification can be considered to do this particularly well.

Please be advised that while the VBSpam test runs in a lab that mimics a real-life environment relatively well, there may be scenarios that are not fully covered by the test.

Test parts

The VBSpam test is run as a single continuous test. Participating products receive a mix of emails from various feeds. The results are reported (both in the public reports and in privately shared feedback) separately for the different feeds.

Public and private testing

The VBSpam set-up runs continuously throughout the year, with participating vendors receiving weekly feedback on their products’ performance (this feedback is for the vendors' own internal information and is not made public by either Virus Bulletin or the participating vendors). Four 16-day periods are designated as official test periods and, depending on the type of testing agreement in place, products are either tested privately (results are for vendor internal information only) or publicly (results are included in the test report published on Virus Bulletin’s website and certification logos are awarded dependent on the products' performance).

No feedback is given during the official test periods, during holidays, or when maintenance is taking place.

The public test only includes products belonging to vendors who have committed to the public test. A product whose vendor has committed to the public test may not be withdrawn once the testing starts, unless technical problems prevent successful completion of testing.

Testing procedure

General set-up

A participating product (a 'full solution'; see note on partial solutions below) should be able to receive emails through SMTP and deliver filtered emails to a secondary MTA, also through SMTP.

Products can be set up either in the Virus Bulletin lab or in the cloud (‘hosted solutions’). In the former case, the product can be set up on an operating system provided by Virus Bulletin or as a (virtual) appliance. Vendors are given access to their products and it is the vendors' responsibility to set their products up correctly.

Emails

Emails are received from various sources (see below). They are modified to make it appear as if they have been sent directly to a user on the vbspamtest.com domain, but are otherwise left unchanged.

Each email is delivered to the product through an individual SMTP transaction. The transaction is made from a fixed IP address: 10.1.70.41 for locally hosted products, and 79.99.68.210 for hosted solutions.

A single Received header is added at the top of every email:

Received: from xxx.xxx.xxx (HELO yyy.yyy.yyy [12.34.56.78]) by mx.vbspamtest.com 
(qpsmtpd/0.81) with [E]SMTP id dddddddddddddddddddd.ppp; Tue, 12 May 2019 18:20:24 +0100

where the date is the current date and time, 12.34.56.78 is the IP address of the sending MTA, yyy.yyy.yyy is the domain in the HELO/EHLO command, and xxx.xxx.xxx is its reverse DNS record (defaults to Unknown if no reverse DNS record is found). The string dddddddddddddddddddd.ppp is used by the back-end MTA to uniquely identify both the email and the product.
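As an illustration of the header layout described above, the following is a minimal Python sketch that splits such a Received header into its parts. The function name parse_received and the field names (rdns, helo, ip and so on) are chosen here for illustration and are not part of the methodology; the pattern assumes the header follows the sample exactly.

```python
import re
from email.utils import parsedate_to_datetime

# Pattern modelled on the sample header shown above; the group names are
# illustrative labels, not terms defined by the methodology.
RECEIVED_RE = re.compile(
    r"from\s+(?P<rdns>\S+)\s+"
    r"\(HELO\s+(?P<helo>\S+)\s+\[(?P<ip>[^\]]+)\]\)\s+"
    r"by\s+(?P<by>\S+)\s+\((?P<software>[^)]*)\)\s+"
    r"with\s+(?P<protocol>E?SMTP)\s+id\s+(?P<id>[^;\s]+);\s*(?P<date>.+)$"
)


def parse_received(value):
    """Return the header's fields as a dict, or None if it does not match."""
    # Unfold continuation lines and collapse runs of whitespace.
    unfolded = " ".join(value.split())
    # Accept the header with or without its "Received:" name.
    if unfolded[:9].lower() == "received:":
        unfolded = unfolded[9:].lstrip()
    match = RECEIVED_RE.match(unfolded)
    if match is None:
        return None
    fields = match.groupdict()
    fields["date"] = parsedate_to_datetime(fields["date"])
    return fields
```

A filter could use the ip and helo fields to recover the original sending MTA, since the SMTP connection itself always comes from the fixed test address.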

The HELO/EHLO domain of the SMTP envelope will be changed to reflect delivery from a local MTA; the MAIL FROM and RCPT TO values will be the same as those of the original message.

Filtered emails should be returned to localbackend.vbspamtest.com and/or localsidedoor.vbspamtest.com (for locally hosted products) or backend.vbspamtest.com and/or sidedoor.vbspamtest.com (for hosted solutions).

Classification

The test assumes products mark each email as ‘ham’ or ‘spam’. Participating vendors are asked to provide details as to how Virus Bulletin’s mail servers are to determine how an email has been marked. Most typically, this is done by adding specific headers, or by tagging the email subject.
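A rough sketch of how a returned email might be checked for such a marking is given below. The header name X-Spam-Flag, its value YES and the subject tag [SPAM] are hypothetical defaults used only for illustration; in the test, each vendor specifies its own marking.

```python
from email import message_from_string


def marked_as_spam(raw_email, header_name="X-Spam-Flag",
                   header_value="YES", subject_tag="[SPAM]"):
    """Return True if the email carries the (hypothetical) spam marking.

    The default header name, value and subject tag are assumptions made
    for this example; they are not taken from the methodology.
    """
    msg = message_from_string(raw_email)
    # Marking by a dedicated header.
    flag = str(msg.get(header_name, ""))
    if flag.strip().upper() == header_value.upper():
        return True
    # Marking by a tag at the start of the subject line.
    subject = str(msg.get("Subject", ""))
    return subject.strip().upper().startswith(subject_tag.upper())
```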

An email that receives a 5xx response during the SMTP transaction is considered to have been marked as spam.
An email that is not returned to the backend MTA is considered to have been marked as spam.
An email that receives a 4xx response during the SMTP transaction, or where the transaction fails to be made or is interrupted, will be resent, up to six times, in 20-minute intervals.
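The three rules above can be sketched as a small decision function. This is an illustration only: the function and label names are invented here, and the methodology does not state how an email is scored once all six resends have failed, so the sketch reports that case separately rather than guessing.

```python
MAX_RESENDS = 6                # "up to six times"
RESEND_INTERVAL_MINUTES = 20   # "in 20-minute intervals"


def next_step(smtp_code, resends_so_far):
    """Decide what happens after a delivery attempt.

    smtp_code is the final SMTP reply code, or None if the transaction
    could not be made or was interrupted. The returned labels are
    illustrative names, not terms from the methodology.
    """
    if smtp_code is not None and 500 <= smtp_code <= 599:
        return "marked_spam"
    if smtp_code is None or 400 <= smtp_code <= 499:
        if resends_so_far < MAX_RESENDS:
            return "resend"
        # Not specified in the text; left for the caller to handle.
        return "resends_exhausted"
    # Accepted: the verdict depends on whether and how the email is
    # returned to the back-end MTA.
    return "await_return"


def resend_offsets_minutes():
    """Minutes after the first attempt at which resends would occur,
    assuming 20 minutes between consecutive attempts."""
    return [RESEND_INTERVAL_MINUTES * n for n in range(1, MAX_RESENDS + 1)]
```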

Partial solutions

A product is considered a ‘partial solution’ if it has access to only part of each email, for example the sending IP address and/or the domains contained in the email. The performance of partial solutions should not be compared to that of full solutions, and cannot always be compared to that of other partial solutions either. Virus Bulletin can query partial solutions through DNS and/or through an API.
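The methodology does not say which query format is used for DNS-based partial solutions. As an assumption for illustration, the sketch below builds query names following the widely used DNS blocklist convention (reversed IPv4 octets, or the domain itself, prepended to the list's zone). The zone name in the usage note is hypothetical.

```python
import ipaddress


def ip_query_name(ip, zone):
    """Build a blocklist-style DNS query name for an IPv4 address.

    Assumes the common convention of reversing the octets and appending
    the list's zone; the zone is supplied by the caller.
    """
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def domain_query_name(domain, zone):
    """Build a blocklist-style DNS query name for a domain."""
    return f"{domain.rstrip('.').lower()}.{zone}"
```

For example, ip_query_name("12.34.56.78", "bl.example.test") yields a name that a resolver could then look up; whether the lookup returns a record would indicate the product's verdict.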

Email feeds and sources

Virus Bulletin receives emails from three sources:

Abusix
Project Honey Pot
Our own direct feeds

The emails are modified to make it appear as if they were sent directly to Virus Bulletin’s servers and to hide traces of spam traps. (Note: this might break DKIM signatures.) Emails are not modified in any other way.

It is important to note that, for a variety of reasons, far more emails are sent as part of the test than eventually end up in the various corpora. The lack of inclusion of an email in the post-test feedback should not be taken as confirmation that the product correctly classified the email.

Emails are classified into one or more categories. This classification uses a combination of automatic and manual processes. Currently, the following categories exist:

Ham

The ham category consists of legitimate emails, typically written by a single person. Many emails in the ham corpus are sent to email discussion lists. Some of these emails have been re-engineered to make it appear as if they were sent to the vbspamtest.com domain directly. The ham emails are written in many different languages, but the majority are in English.

Newsletters

The newsletters category consists of opt-in, one-to-many emails of various kinds, both commercial and non-commercial. We do not require the opt-in to have been confirmed, but we do ensure that the emails included in the test are related to what was originally subscribed to.

No more than three emails per subscription are included in each feedback.

Spam

The spam category includes emails sent to spam traps, regardless of their content.

Potentially unwanted

Potentially unwanted emails (to be read as ‘merely unwanted’) are emails whose content suggests that, while the recipient may not want them, the same emails might be wanted by others. They are part of the spam category but are counted with a weight of 20% in the test.

Phishing

Phishing emails are spam emails containing a link that leads either to malware or an attempt to steal credentials.

Malware

Malware emails are emails with an attachment that is either malware itself or would likely download malware. It is possible that a password (often present in the email) would need to be entered for the malware to be executed.

Speed

For each email sent to products, the ‘speed’ (or ‘delay’) is measured as the time it takes for the email to be returned from the product: the counter starts when the delivery of the email starts and stops when the returned email is accepted. The speed is only relevant for emails from the ham corpus and is used to indicate whether products delay the delivery of emails, which would give them an advantage in spam filtering.

Because small delays don’t make a significant difference from the end-user's perspective, the reports and feedback only include the delays measured at the 10th, 50th, 95th and 98th percentiles, and these are indicated as being less than 30 seconds (green); between 30 seconds and 2 minutes (yellow); between 2 minutes and 10 minutes (orange); and more than 10 minutes (red).

	(green) = up to 30 seconds
	(yellow) = 30 seconds to two minutes
	(orange) = two to ten minutes
	(red) = more than ten minutes

Note that for an email that needed to be re-sent (with a delay of 20 minutes, see above), this delay is added to the measured ‘speed’.
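The colour bands above can be sketched as follows. Two details are assumptions made for this example, since the text does not fix them: a delay exactly on a boundary is assigned to the faster band, and percentiles are computed with the nearest-rank method. Delays are in seconds and are expected to already include any resend delay.

```python
def delay_colour(seconds):
    """Map a delay in seconds to its colour band.

    Boundary values go to the faster band (an assumption).
    """
    if seconds <= 30:
        return "green"
    if seconds <= 120:
        return "yellow"
    if seconds <= 600:
        return "orange"
    return "red"


def percentile(delays, pct):
    """Nearest-rank percentile (an assumed method) for an integer pct."""
    ordered = sorted(delays)
    # Integer ceiling of pct * n / 100, clamped to at least rank 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def speed_colours(ham_delays):
    """Colours at the four reported percentiles for the ham corpus."""
    return {pct: delay_colour(percentile(ham_delays, pct))
            for pct in (10, 50, 95, 98)}
```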

Certification

For each product the ‘weighted false positive rate’ is calculated as follows:

WFP rate = (#false positives + 0.2 * min(#newsletter false positives, 0.2 * #newsletters)) / (#ham + 0.2 * #newsletters)

And the ‘spam catch rate’ is calculated as follows:

SC rate = 1 - (#false negatives + 0.2 * #unwanted false negatives) / (#spam + 0.2 * #unwanted)

The final score is calculated as:

Final score = SC - (5 x WFP)
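A worked sketch of these calculations is given below. It rests on interpretations that the text does not spell out: both rates are expressed as percentages (so that the final score is comparable with the threshold of 98), the spam catch rate is read as the complement of the weighted share of missed spam, and the plain spam counts are taken to exclude the potentially unwanted emails, which are counted separately at 20%. The function and parameter names are illustrative.

```python
def wfp_rate(false_positives, newsletter_fps, ham, newsletters):
    """Weighted false positive rate, as a percentage."""
    # Newsletter false positives are capped at 20% of the newsletter
    # corpus and then weighted at 0.2.
    weighted = false_positives + 0.2 * min(newsletter_fps, 0.2 * newsletters)
    return 100.0 * weighted / (ham + 0.2 * newsletters)


def sc_rate(false_negatives, unwanted_fns, spam, unwanted):
    """Spam catch rate, as a percentage.

    Read here as 100% minus the weighted miss ratio (an interpretation).
    """
    missed = (false_negatives + 0.2 * unwanted_fns) / (spam + 0.2 * unwanted)
    return 100.0 * (1.0 - missed)


def final_score(sc, wfp):
    """Final score = SC - (5 x WFP), with both inputs in percent."""
    return sc - 5 * wfp
```

With these definitions, each percentage point of weighted false positives costs five points of final score, which reflects the heavier penalty the test places on blocking legitimate email.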

Products earn VBSpam certification if the final score is at least 98, the ‘delivery speed colours’ at 10% and 50% are green or yellow, and the colour at 95% is green, yellow or orange.

Meanwhile, products earn a VBSpam+ award if they combine a spam catch rate of 99.5% or higher with no false positives, no more than 2.5% false positives among the newsletters, and ‘delivery speed colours’ of green at 10% and 50% and green or yellow at 95% and 98%.
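The two sets of criteria can be expressed as a short check. This is a sketch under stated assumptions: the function and parameter names are invented here, colours is a mapping from percentile (10, 50, 95, 98) to a colour name, and the VBSpam+ conditions are evaluated exactly as listed, since the text does not say whether the basic VBSpam conditions must also hold.

```python
def award(final, sc, ham_false_positives, newsletter_fp_pct, colours):
    """Return "VBSpam+", "VBSpam" or None.

    final and sc are in percent; newsletter_fp_pct is the percentage of
    newsletters that were false positives; colours maps 10, 50, 95 and 98
    to "green", "yellow", "orange" or "red".
    """
    plus = (
        sc >= 99.5
        and ham_false_positives == 0
        and newsletter_fp_pct <= 2.5
        and colours[10] == "green"
        and colours[50] == "green"
        and colours[95] in ("green", "yellow")
        and colours[98] in ("green", "yellow")
    )
    if plus:
        return "VBSpam+"
    base = (
        final >= 98
        and colours[10] in ("green", "yellow")
        and colours[50] in ("green", "yellow")
        and colours[95] in ("green", "yellow", "orange")
    )
    return "VBSpam" if base else None
```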

Feedback

Feedback is provided on a weekly basis to participants outside official test periods and holidays. Feedback can be contested and can also be used to improve products. Because it can be contested, it should not be assumed to be 100% accurate or to provide guarantees for future email classifications. It is not intended to be the equivalent of end-users marking email as ‘spam’ or ‘not spam’.

Feedback consists of the original emails, the transaction logs and, where applicable, the header of the email as it was returned by the product. If a product misses more than 400 emails in a category, only a sample of the missed emails is included.

In case of disputes, especially where they concern possible technical issues, log files of both the product and VB’s own systems may be consulted. In all cases, Virus Bulletin's decision is final.

When the feedback concerns an official test, we require vendors to send any disputes within seven days.

Test funding

The test is paid for by the participating vendors. No other funds are received.

VBSpam methodology - ver.2.0
