John Aycock highlights an ACSAC paper that looks at the issue of detecting web content modifications.
Copyright © 2014 Virus Bulletin
From time to time, I visit schools to give outreach talks about computer security. One question I always ask the students is how they know that what they see in their web browser is the same information as was sent from the web server. It’s easy to forget that for many – possibly most – users, the Internet is a magical thing. For regular users, of course the information they receive comes straight from the source; why would they suspect that what they see has been broken apart like countless Lego bricks, strewn across multiple devices and systems, and reconstructed seamlessly for their pleasure?
The reality is different, of course, and even if both the user’s machine and the server(s) providing content are assumed to be uncompromised, there are many points along the way at which content changes can occur. As security professionals, we might be inclined to think first of man-in-the-middle attacks, and while that is a possibility, it is just one of several. What makes some other content-changing scenarios interesting is that they are legitimate (for certain values of the word ‘legitimate’), and are not any easier to detect despite their legitimacy.
A case in point: companies market products to Internet service providers, providing ‘in-browser messaging’ for a variety of purposes [1]. The corresponding patents are much more detailed, and specifically say ‘content may be modified or replaced along the path to the user’ [2].
Back in 2008, Reis et al. took a crack at detecting content modification by using what they called ‘web tripwires’ [3]. Their idea was for a server to provide content as usual, but to include a script to run in the browser that would compare the content received with some ‘ground truth’ version of the same content. As they tested the web tripwires, they were indeed able to spot some content modifications made en route, as well as tripping across some instances where ad-blocking software had thoughtfully injected exploitable vulnerabilities into the content.
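To make the comparison step concrete, here is a minimal sketch in Python. It is not Reis et al.'s implementation: their tripwires run as a script in the browser, and the use of SHA-256 digests and the names `fingerprint` and `tripwire_fired` are my own assumptions for illustration. It only shows the core idea of checking what arrived against a known-good copy.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest of the page content."""
    return hashlib.sha256(content).hexdigest()


def tripwire_fired(received: bytes, ground_truth: bytes) -> bool:
    """Report True when the received content differs from the ground truth.

    Hypothetical helper: a real tripwire would run in the browser and
    obtain the ground truth from the server by a separate request.
    """
    return fingerprint(received) != fingerprint(ground_truth)


# Illustrative data only: a page as served, and the same page after
# something along the path has injected an extra script tag.
original = b"<html><body><p>Hello</p></body></html>"
injected = b"<html><body><p>Hello</p><script src='ad.js'></script></body></html>"

if tripwire_fired(injected, original):
    print("content was modified in flight")
```

A digest comparison like this can only say that something changed, not what; locating the change would require comparing the two versions directly.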
Another ‘legitimate’ content modification scenario is censorship. (Done only to protect the children / catch terrorists / facilitate a stable and moral society, you understand; pick your favourite.) There are more than enough examples of censorship to go around – the OpenNet Initiative’s reports [4] are an informative, if somewhat depressing, place to start reading – but the question is how such content modifications might be detected.
As it happens, December is the season for ACSAC, the Annual Computer Security Applications Conference, and a paper included in the proceedings of the 2013 event looked at exactly this problem. Wilberding et al.’s ‘Validating web content with Senser’ [5] works ‘even when SSL/TLS is not supported by the web server’ – a fact I mention because they thought to include that exact phrase in the paper’s abstract and in the introduction, italicized in both places. It has excellent Internet meme potential, I think, e.g. ‘Cleans your dirty dishes… even when SSL/TLS is not supported by the web server.’ When the meme goes viral, just remember you heard it in VB first. But I digress.
What Senser does is build a ‘consensus’ view of the content of a web page by querying a number of proxies distributed around the Internet. The premise is that, as long as the majority of the proxies receive an unfettered view of the web page, a version of it can be reconstructed in the browser. There are a number of practical problems: localized or customized content may be delivered even in the absence of censorship, and there may be ‘AS-level adversaries who control large segments of the network and may attempt to manipulate web content’ [5, p.340]. Not to name names. Senser devotes considerable effort to diversifying the AS-level paths between the proxies and the server with the content, in an attempt to route around this problem.
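The majority premise can be sketched in a few lines of Python. This is a toy under stated assumptions, not Senser's algorithm: it treats each proxy's response as an opaque blob and votes on whole pages, and the function name `consensus_view` is hypothetical. Handling localized or customized content would take more care than this.

```python
import hashlib
from collections import Counter
from typing import Optional, Sequence


def consensus_view(responses: Sequence[bytes]) -> Optional[bytes]:
    """Return the response seen by a strict majority of proxies, if any.

    Each element of `responses` is the page as fetched by one proxy.
    Returns None when there are no responses or no version has more
    than half of the votes.
    """
    if not responses:
        return None
    digests = [hashlib.sha256(r).hexdigest() for r in responses]
    digest, votes = Counter(digests).most_common(1)[0]
    if votes * 2 <= len(responses):
        return None  # no strict majority, so no consensus
    return responses[digests.index(digest)]


# Illustrative data only: three proxies see the real page, one sees a
# modified version.
real_page = b"<html><body>news</body></html>"
altered_page = b"<html><body>nothing to see here</body></html>"
views = [real_page, altered_page, real_page, real_page]

assert consensus_view(views) == real_page
```

The strict-majority rule mirrors the premise in the text: if half or more of the proxies sit behind the same modifier, a vote like this has nothing trustworthy to return, which is why diversity of network paths matters.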
An underlying issue I have with systems like Senser (and also Tor) when it comes to censorship is that meta-data reveals many potentially compromising things. Depending on where a user resides, the simple fact (meta-meta-data, almost) that they used Senser or Tor is suspicious. Until these systems are shipped, enabled by default, in Windows, nothing will change that… and that’s not a happy Disney ending to reach out to school children with.
[1] PerfTech. Solutions. http://www.perftech.com/solutions.html, last accessed 18 February 2014.
[2] Donzis, H. M.; Donzis, L. T.; Frey, R. D.; Murphy, J. A.; Schmidt, J. E. Internet connection user communications system. U.S. Patent 8,108,524, 31 January 2012.
[3] Reis, C.; Gribble, S. D.; Kohno, T.; Weaver, N. C. Detecting in-flight page changes with web tripwires. 5th USENIX Symposium on Networked Systems Design and Implementation, 2008, pp.31–44.
[4] OpenNet Initiative. https://opennet.net/, last accessed 18 February 2014.