VB ESA - M365 is Virus Bulletin’s continuously running performance test programme for solutions that supplement Microsoft 365’s native security (Exchange Online Protection (EOP)[1]) by adding extra detection layers. Products can participate in the programme either publicly or privately; this methodology documents the full details of the test programme in both cases.
Public VB ESA - M365 tests are designed to quantify the changes in email filtering performance that result from deploying tested solutions alongside Microsoft 365, and to compare performance metrics among the evaluated products.
Outside the public test periods (and for privately tested products), VB seeks to provide participating vendors with continuous feedback about the performance of their product.
The Microsoft 365 comparative base is used to establish a common reference against which the additional detection and filtering capabilities of Microsoft 365 email security add-ons are measured. The comparative base represents email samples that were handled by Microsoft 365 alone and therefore fall outside the scope of evaluation for the tested add-on products.
A sample is included in the comparative base if it meets the following conditions:
Product effectiveness is measured only on samples that are not part of the comparative base.
Samples included in the comparative base are excluded from product scoring and do not contribute to the detection or filtering metrics of the tested add-ons.
The Microsoft 365 comparative base is a synthetic construct created for the purpose of this test and should not be interpreted as a definitive measure of Microsoft 365 performance in isolation.
The following limitations apply:
Despite these limitations, the comparative base provides a stable and transparent reference that enables meaningful comparison of add-on performance.
The test exposes each tested product to both unwanted and legitimate emails, and records the product’s response to these emails.
VB uses a variety of email sources for test cases:
All emails used in the test are in-the-wild samples. Virus Bulletin does not create new emails (e.g. to simulate spear-phishing tactics). Some modifications to the in-the-wild emails are necessary to facilitate testing and to protect intellectual property, as detailed later in this document.
Emails are forwarded to the tested product without undue delay after our threat intelligence systems receive them, in order to stay as close to real time as possible.
The legitimate emails used are predominantly written in English, whereas unwanted emails represent a wide variety of languages.
Note that full solutions and complementary solutions may be subjected to a slightly different mix of emails when there is a potential conflict of interest (for example, if the vendor that supplies the email feed also has its products publicly tested).
Emails are sorted into the following categories and subcategories:
The products’ responses are referenced against the respective body of the test cases and are sorted into the following categories:
Definitions
Incremental Detection Rate (IDR)
The Incremental Detection Rate is the proportion of malicious samples not filtered by Microsoft 365 that are detected by the tested add-on.
The Incremental Detection Rate is calculated as:
IDR = (Number of malicious samples not filtered by Microsoft 365 that are detected by the add-on) ÷ (Total number of malicious samples not filtered by Microsoft 365)
Where:
Residual sample set
The residual sample set is the set of email samples that pass Microsoft 365 filtering and are therefore eligible for evaluation by the tested solutions.
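The relationship between the comparative base, the residual sample set and the IDR can be illustrated with a minimal Python sketch. The `Sample` record and its field names are hypothetical and exist only for this example; in particular, `filtered_by_m365` stands in for whatever conditions place a sample in the comparative base. Returning `None` for an empty denominator is also an assumption of the sketch, not a rule stated in this methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One test-case email; all field names are illustrative assumptions."""
    malicious: bool          # ground-truth label after validation
    filtered_by_m365: bool   # True if the sample belongs to the comparative base
    detected_by_addon: bool  # True if the tested add-on flagged the sample


def residual_set(samples):
    """Samples that passed Microsoft 365 filtering (outside the comparative base)."""
    return [s for s in samples if not s.filtered_by_m365]


def incremental_detection_rate(samples):
    """IDR = detected malicious residual samples / all malicious residual samples."""
    malicious = [s for s in residual_set(samples) if s.malicious]
    if not malicious:
        return None  # assumption: IDR is undefined when the denominator is empty
    detected = sum(1 for s in malicious if s.detected_by_addon)
    return detected / len(malicious)


samples = [
    # Comparative base: handled by Microsoft 365 alone, so ignored for scoring.
    Sample(malicious=True, filtered_by_m365=True, detected_by_addon=True),
    # Residual malicious samples: three of the four are detected by the add-on.
    Sample(malicious=True, filtered_by_m365=False, detected_by_addon=True),
    Sample(malicious=True, filtered_by_m365=False, detected_by_addon=True),
    Sample(malicious=True, filtered_by_m365=False, detected_by_addon=False),
    Sample(malicious=True, filtered_by_m365=False, detected_by_addon=True),
    # Legitimate residual sample: not part of the IDR calculation.
    Sample(malicious=False, filtered_by_m365=False, detected_by_addon=False),
]

print(incremental_detection_rate(samples))  # 0.75
```

In this example the sample filtered by Microsoft 365 does not affect the result even though the add-on would also have detected it, which reflects the rule that effectiveness is measured only outside the comparative base.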
In addition to certification, the test recognises exceptional performance through a series of awards and badges. These distinctions are intended to highlight products that demonstrate outstanding incremental protection beyond Microsoft 365’s native filtering, while maintaining strict control over false positives.
All awards are evaluated exclusively on samples outside the Microsoft 365 comparative base.
Awards:
Badges:
The product lifecycle in VB ESA - M365 begins with an initial product setup, typically done in cooperation with the vendor.
This is followed by continuous testing for the designated testing period. For publicly tested products that join the test on a commercial basis, the designated testing period is approximately one year, during which four shorter periods are designated as official test periods. Data obtained from the official test periods serve as the basis for public, comparative test results and certification.
The test environment uses a Microsoft 365 tenant with one licensed user account on Microsoft 365 Business Basic or Microsoft 365 Business Standard. The assessment focuses on email filtering capabilities; these licences include Exchange Online Protection (EOP)[2] and do not include Defender for Office 365[3].
In order to allow the uninterrupted flow of emails from VB’s server, a connector[4] is configured in the “Exchange admin center” for each of the tested solutions.
Test cases (emails) are sent continuously to the tested products.
Each email is delivered in an individual SMTP transaction. Virus Bulletin seeks to keep modifications to the original, in-the-wild versions of the emails to a minimum; however, a number of differences between the original and in-test emails are inevitable. Some of these changes are equivalent to those made by a front-end SMTP server that rewrites the recipient information:
The more invasive changes are:
Some metadata about the original email will be retained through a Received: header, again as if the email were received first by a front-end SMTP server. A single new Received: header will be inserted into the email, in the following fashion:
Received: from <original-reverse-dns> (HELO <original-helo-domain> [<originating-ip>]) by <vb-mta> (<vb-mta-software>) with [E]SMTP id <vb-message-id>; <date>
where
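As an illustration of the Received: header template above, the following Python sketch fills in the placeholders with made-up values. The function name, parameter names, host names and message ID are all hypothetical; the sketch also assumes that "[E]SMTP" means the header carries either "ESMTP" or "SMTP", depending on the protocol used. It does not describe VB's actual tooling.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def build_received_header(reverse_dns, helo_domain, originating_ip,
                          vb_mta, vb_mta_software, message_id,
                          received_at, esmtp=True):
    """Fill in the Received: template with concrete values (illustrative only)."""
    protocol = "ESMTP" if esmtp else "SMTP"  # assumed reading of "[E]SMTP"
    return (
        f"Received: from {reverse_dns} (HELO {helo_domain} [{originating_ip}]) "
        f"by {vb_mta} ({vb_mta_software}) with {protocol} id {message_id}; "
        f"{format_datetime(received_at)}"  # RFC 5322-style date string
    )


header = build_received_header(
    reverse_dns="mail.example.net",   # placeholder for the original reverse DNS
    helo_domain="mail.example.net",   # placeholder for the original HELO domain
    originating_ip="192.0.2.10",      # address from a documentation-only range
    vb_mta="mta.example.org",         # placeholder, not a real VB host name
    vb_mta_software="ExampleMTA",     # placeholder software name
    message_id="ABC123",              # placeholder message ID
    received_at=datetime(2025, 1, 15, 9, 30, 0, tzinfo=timezone.utc),
)
print(header)
```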
A tested product receives test case emails from the VB infrastructure and the product’s response is recorded. These responses are sorted into two categories:
For Integrated Cloud Email Security (ICES) the following rules are applied:
For Secure Email Gateways (SEGs) for Microsoft 365 the following rules apply:
To closely simulate real-time conditions, emails from real-world sources are promptly introduced into the testing environment without unnecessary delay. Despite employing the best threat intelligence and meticulously crafted automation, some of the emails received by the tested products may not be relevant for the test. Therefore, Virus Bulletin regularly reviews and validates test case emails, discarding any that are deemed unsuitable.
Note that in this validation process, many perfectly suitable emails may also be discarded due to the limited capacity for manual review.
As a general rule, feedback is provided to the participating products on a weekly basis. No feedback is given during the official test periods; participants will be given advance warning of any other interruption to feedback (e.g. due to holidays or maintenance periods). The feedback provided is non-comparative by nature, i.e. the feedback by itself is not suitable to determine how a product ranks against other products in the test.
This feedback is for the vendor’s own information only, and sharing of the details publicly either by the vendor or by Virus Bulletin is not permitted.
Feedback includes:
Note that Virus Bulletin may cap the number of emails shared at 400 per email category.
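The per-category cap can be pictured with a short Python sketch. The function name and the `(category, email_id)` input shape are hypothetical, and keeping the first 400 emails in input order is an assumption of this example; the methodology does not state which emails are retained when the cap applies.

```python
from collections import defaultdict

MAX_SHARED_PER_CATEGORY = 400  # cap stated in the methodology


def cap_per_category(emails, limit=MAX_SHARED_PER_CATEGORY):
    """Keep at most `limit` email IDs per category, preserving input order.

    `emails` is an iterable of (category, email_id) pairs (assumed shape).
    """
    kept = defaultdict(list)
    for category, email_id in emails:
        if len(kept[category]) < limit:
            kept[category].append(email_id)
    return dict(kept)


# Hypothetical feedback data: 450 emails in one category, 3 in another.
emails = [("spam", i) for i in range(450)] + [("phishing", i) for i in range(3)]
shared = cap_per_category(emails)
print({category: len(ids) for category, ids in shared.items()})
# {'spam': 400, 'phishing': 3}
```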
Disputes may be submitted at any time. However, for the official test period, Virus Bulletin requires that public test participants submit their disputes within 10 business days of receiving the feedback, to ensure timely publication of the public report.
Disputes are evaluated on a case-by-case basis. The vendor is asked to provide supporting data or evidence, if any, along with their dispute. Although all efforts will be made to resolve disputed issues to the satisfaction of all parties, Virus Bulletin reserves the right to make the final decision.
To reflect the broad nature of real-life issues, the scope of the disputes is not limited.
Public test reports can only be representative for the reader if the testing is conducted using publicly available product versions. For this reason, vendors are not permitted to use any product enhancement that is not generally available. We encourage (but do not require) vendors to share with the public the configuration used for their product in the test.
Tests are usually conducted with the latest generally available version of a product or service being tested. Deviations from this policy will be documented in the report.
Private tests are not subject to these constraints.
For both public and private product testing, credentials must be provided for a Microsoft 365 account with a Microsoft 365 Business Basic or Microsoft 365 Business Standard licence. This account must have administrator rights and be configured with the specific protection offered by the product being tested.
Email security products often classify emails into various categories.
However, the VB ESA - M365 test relies on binary classification: there is either a hit on a test case (“email not wanted”), or there is no hit (“legitimate”). It is the product vendor’s responsibility to map their own classifications onto one of these two categories. In the absence of such a mapping from the vendor, the Virus Bulletin test team will endeavour to set up the mapping themselves.
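A vendor-side mapping of this kind might look like the following Python sketch. The verdict labels ("spam", "phishing", "malware", "clean") are hypothetical, since each product defines its own categories. Raising an error for an unmapped verdict is a design choice of this example, so that new product categories are noticed rather than silently counted as either outcome.

```python
# Hypothetical vendor verdict labels; a real product defines its own.
UNWANTED_VERDICTS = {"spam", "phishing", "malware"}
LEGITIMATE_VERDICTS = {"clean"}


def to_binary(verdict):
    """Map a vendor verdict onto the test's two outcomes."""
    label = verdict.strip().lower()
    if label in UNWANTED_VERDICTS:
        return "email not wanted"  # counts as a hit on the test case
    if label in LEGITIMATE_VERDICTS:
        return "legitimate"        # counts as no hit
    raise ValueError(f"unmapped verdict: {verdict!r}")


for verdict in ("Phishing", "clean", " SPAM "):
    print(verdict.strip(), "->", to_binary(verdict))
```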
Products participating in the public test cannot be withdrawn from the test once the official test period has started. Public interest dictates that a test report is to be published, regardless of whether or not it is favourable from the vendor’s perspective.
However, Virus Bulletin may, at its discretion, allow withdrawal of a product in extraordinary circumstances, when compelling reasons suggest that its inclusion in the report would bear no relevance to the public. Examples of such situations include collected data that is proven to be tainted by lab-specific technical issues, and significant testing errors such as deviations from protocol. Note that technical issues that affect not just the particular test environment but a wider user base at the time of the test (e.g. a cloud outage or faulty rule updates) do not qualify as a basis for withdrawal, and Virus Bulletin may proceed to publish the report.
Virus Bulletin pledges to work with the vendor to resolve technical issues with the product and notify the vendor as soon as possible when such issues are detected.
Prior to the publication of a report, the vendor of the product may choose to provide commentary to be included in the report notes. This is to ensure that the vendor’s perspective receives a fair representation. Such commentaries can be useful when the report contents are disputed by the vendor. Commentaries are subject to reasonable length limits and editorial approval.
Vendors of full email security solutions are provided with remote access to their products and may audit their configuration, state, or logs at any time.
The public test series features participants that have signed up and opted to be included in public testing. Virus Bulletin may also choose to include products at its own discretion. For products of the latter type, Virus Bulletin commits to:
Version 1.0