Test process outline
Test case sets
Test results compilation
Detailed test information
Detailed test process description
The anatomy of a test
Test case validation
Feedback and disputes
Product build and configuration policies
Binary classification mapping
Treatment of products with partial feature coverage
Withdrawal from the test (opting out)
Technical issue resolution
VB100 is a certification test for Windows endpoint security solutions.
VB100 seeks to establish whether the static detection layer of the tested product is capable of detecting common, file-based Windows threats, without generating an excessive number of false alarms for legitimate programs.
The test exposes the product to both malicious and legitimate program samples, and records the tested product’s response to these samples.
Exposure to samples happens in three steps. First, each sample is downloaded and saved to the local file system in the presence of the product. Then, an on-demand scan is requested from the product for the successfully saved samples. Finally, any remaining samples are inventoried and their integrity is verified.
Products are expected to demonstrate their protection capabilities by intervening (or not intervening) in this process. Any sample that reaches the end of this process without errors or changes is considered “not detected”; all other samples are considered “detected”.
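The detected/not-detected rule above can be sketched in a few lines of Python. This is an illustration only, not the VB test agent's implementation: the `SampleRecord` type and its three fields are hypothetical names chosen to mirror the download, scan and inventory steps described in the text.

```python
from dataclasses import dataclass


@dataclass
class SampleRecord:
    """Hypothetical per-sample record; field names are assumptions."""
    saved_ok: bool            # download-and-save completed without error
    present_after_scan: bool  # file still exists after the on-demand scan
    integrity_ok: bool        # content is unchanged at inventory time


def is_detected(rec: SampleRecord) -> bool:
    # A sample counts as "not detected" only if it survives every step
    # without errors or changes; anything else counts as "detected".
    survived = rec.saved_ok and rec.present_after_scan and rec.integrity_ok
    return not survived
```

For example, a sample that is saved successfully but removed by the on-demand scan yields `is_detected(SampleRecord(True, False, False)) == True`.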
Test case sets are compiled frequently to include freshly acquired samples. We aim to base each new test for the product on the most recent test set available. The same test sets may be used for multiple products if they are tested close together in time.
Test case sets are composed of two subsets:
After completing the test process, the product’s responses are referenced against the respective body of the test cases and are sorted into the following categories:
Rates for each set are determined and verified against the test criteria. A test report is published (regardless of whether the test criteria are met) and the certified state of the product is reviewed.
A product is considered to have passed a test if the product receives any grade better than Grade F.
Grades are determined based primarily on the true positive rates of the Certification Set.
|Grade|True positive rate requirement|
|---|---|
|Grade A+|>= 99.5%|
|Grade A|>= 97%|
|Grade B|>= 90%|
|Grade C|>= 85%|
|Grade D|>= 75%|
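As a minimal sketch, the thresholds in the table above can be applied as a simple ordered lookup. The function name `grade_from_tp_rate` is an assumption made for illustration. The sketch covers only the true positive rate thresholds listed in the table; it returns `None` for rates below the Grade D threshold and does not model the Grade F criteria or any other test criteria.

```python
# Thresholds copied from the table, highest grade first.
GRADE_THRESHOLDS = [
    ("A+", 99.5),
    ("A", 97.0),
    ("B", 90.0),
    ("C", 85.0),
    ("D", 75.0),
]


def grade_from_tp_rate(tp_rate_percent: float):
    """Return the best grade whose threshold the rate meets, else None."""
    for grade, minimum in GRADE_THRESHOLDS:
        if tp_rate_percent >= minimum:
            return grade
    # Below every listed threshold: no grade from this table applies.
    return None
```

Because the list is ordered from the strictest threshold down, the first match is always the best grade the rate qualifies for.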
The product receives Grade F and fails the test if it:
Certified status can be earned and maintained by products that are tested frequently and successfully meet the test criteria.
The test is performed on virtual machines (VMs), hosted by a Type I (bare metal) hypervisor. VMs are provided with resources approximately equivalent to an average endpoint business PC specification.
The test platform is x64 Windows 10 (edition unspecified), regularly maintained with updates from Microsoft.
Test data collection is performed by a custom test agent application developed by Virus Bulletin. This agent runs on the test VM, interacts with test case sets, triggers responses from the product, and generates detailed log-based evidence. These logs later serve as the basis for generating the test results.
The testing process has two distinct phases, each of which consists of several steps.
Products may intervene in this process in several places. Any of the below may be treated as intervention from the product:
Note that a single test case may trigger multiple responses from the product. In such cases, the test deems the product’s first response the most relevant. In a typical example, the tested product may allow downloading and writing a malicious sample to the file system, but it will block opening of the sample for reading (Access Control). The file will be removed during the subsequent on-demand scan, so the inventory process will record a Removal type of intervention for the sample. As the design considers the first intervention to be the most relevant, the ultimate test case outcome will be the first, Access Control-type intervention.
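The first-response rule can be illustrated with a short sketch. It assumes that the interventions recorded for a test case are available as a list in the order they occurred; the function name `resolve_outcome` and the list representation are hypothetical, and only the labels "Access Control" and "Removal" come from the text.

```python
def resolve_outcome(interventions):
    """Pick the test case outcome from chronologically ordered interventions.

    `interventions` is a list of intervention type names, earliest first.
    Returns the first one, or None if the product never intervened.
    """
    if not interventions:
        return None
    # The first response is deemed the most relevant.
    return interventions[0]


# The example from the text: reading is blocked first, and the file is
# removed later by the on-demand scan.
outcome = resolve_outcome(["Access Control", "Removal"])
```

Here `outcome` is `"Access Control"`, matching the example in the paragraph above.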
The product lifecycle in the certification programme begins with an initial product setup, followed by periodic testing. By default, tests are scheduled approximately quarterly, unless VB and the vendor agree otherwise (within the confines of the general certification criteria).
Periodic tests follow the script below:
These steps are explained in the next chapters.
Products are set up in the test environment when they are first submitted to the test. Installation is performed on a clean Windows image. The product is configured as per its default settings (exceptions apply, see Product build and configuration policies).
A snapshot of this state is captured for use in testing (“baseline snapshot”). This snapshot is maintained throughout the participation of the product in the test.
Shortly before the data collection commences, the following procedure is performed to make sure that the product is ready for testing:
Data collection is generally conducted using an initial, candidate version of the test case set. This candidate may contain samples that we later deem unsuitable for the test. In this stage, VB may remove samples from the set. Data collected for such samples are ignored, as if the sample wasn’t part of the test set at all.
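The effect of removing samples after data collection can be sketched as follows. This is an illustration under stated assumptions, not VB's tooling: responses are modelled as a mapping from a sample identifier to a detected/not-detected flag, and the names `true_positive_rate`, `responses` and `removed` are hypothetical.

```python
def true_positive_rate(responses, removed):
    """Percentage of detected samples, ignoring removed samples.

    responses: dict mapping sample id -> True if detected, else False
    removed:   set of sample ids dropped from the candidate test set
    """
    # Data for removed samples is ignored, as if they were never in the set.
    kept = [hit for sample_id, hit in responses.items()
            if sample_id not in removed]
    if not kept:
        raise ValueError("no samples left in the test set")
    return 100.0 * sum(kept) / len(kept)


responses = {"s1": True, "s2": False, "s3": True, "s4": True}
rate_candidate = true_positive_rate(responses, removed=set())
rate_final = true_positive_rate(responses, removed={"s2"})
```

In this toy data, three of the four samples are detected against the candidate set; once `s2` is removed, all three remaining samples are detected.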
The vendor receives preliminary feedback on the product’s performance. This marks the beginning of the dispute period, during which the vendor can examine the results and file any disputes that they may have. At least one calendar week is provided for the vendor to complete their independent verification of the results.
Feedback includes at least the following:
Upon request, VB will also provide:
Depending on the outcome of the disputes, further samples may be removed from the test case set in this step.
Test results are generated against the final state of the test sets, followed by the publication of the test report and a review of the certified status of the product.
We aim to make test report data representative for the average use case. Accordingly, we attempt to test with product builds that are available for general audiences, and to configure/operate the products as close to the defaults as possible.
We acknowledge that this is not always possible. In such cases, substantial deviations from the average use case – where they might impact the test results – will be documented in the published test report for full disclosure.
Some examples of such deviations:
Please note that VB accepts such proposed changes from the vendors at its discretion. Proposals that would negatively impact the relevance of the report to the reader may be rejected.
Security products often classify threats into various classes, including grey areas like PUAs.
However, the VB100 test relies on binary classification – there is either a hit on a test case, or there is no hit. For such grey area cases, the product vendor is advised to provide VB with configuration and usage instructions that follow the vendor’s view on how these test cases should be classified in the VB100 framework. In the absence of such instructions, test engineers will either follow the defaults or advise the vendor on the recommended settings for best compatibility with the testbed.
The VB100 design assumes that the product has both real-time file system protection and on-demand file system scanning features available. When that assumption is not applicable to the product (e.g. a command-line scanner with no real-time element, a detect-and-report only product), VB will examine whether it is possible to accommodate the product within the confines of this methodology. For products that only offer reporting and no actions to be carried out on the samples, VB will simulate the action based on the product’s own report.
As such circumstances are of interest to the reader of the test report, special setups like the above will be documented in the test report.
A product cannot be withdrawn once data collection has started. Public interest dictates that a test report is to be published, regardless of whether it’s favourable or not from the vendor’s perspective.
However, VB may, at its discretion, allow withdrawal of a product in extraordinary cases, when compelling reasons suggest that the report would bear no relevance to the public. Examples of such situations include collected data that is proven to be tainted by lab-specific technical issues, and significant testing errors such as deviations from protocol.
VB pledges to work with the vendor to resolve technical issues with the product and notify the vendor as soon as possible when such issues are detected.
Note that if troubleshooting involves sharing the samples used in testing through logs or by other means, VB reserves the right to postpone the test until a new test set (one that includes samples not known to the vendor) becomes available for testing.
Technical issues that occur during the data collection phase generally result in the test data being discarded and the test being rescheduled, to prevent reporting based on incomplete, unreliable or assumption-based data. Note that technical issues that affect not just the particular test environment but a wider user base (e.g. cloud outage, erroneous signature updates) at the time of the test do not qualify as a basis for test invalidation, and Virus Bulletin may proceed to publish the report.
Disputes are evaluated on a case-by-case basis. The vendor is asked to provide supporting data or evidence, if any, along with their dispute. Although all efforts will be made to resolve disputed issues to the satisfaction of all parties, VB reserves the right to make the final decision.
To reflect the broad nature of real-life issues, the scope of the disputes is not limited. The following are a few examples:
Prior to the publication of a report, the vendor of the product can choose to provide commentary to be included in the report notes. This is to ensure that the vendor’s perspective receives a fair representation. Such commentaries can be useful when the report contents are disputed, or if the report fails to disclose significant circumstances for the reader of the report. Commentaries are subject to reasonable length limits (must fit the designated page) and editorial approval.
Outside the data collection phase, vendors can request an audit of the product configuration and VM state. This is primarily done through a manual verification of the desired configuration or state (as provided by the vendor); however, remote access can also be arranged at a time suitable for both parties. Remote access is supervised by Virus Bulletin personnel.
Version 1.0: First published in this format.