Search engines in research and vulnerability assessment


Alex Eckelberry

Sunbelt Software, USA
Editor: Helen Martin


'Search engines are free, powerful and efficient tools that can be used to find vulnerabilities and hacked sites on the web, and even in your own organization.' Alex Eckelberry, Sunbelt Software.

On 2 October an odd thing happened: state and government websites in California started shutting down. It turned out that the US General Services Administration (GSA) had overreacted somewhat to reports of pornography being hosted on a website for the Transportation Authority of Marin County, and simply pulled the plug on the entire domain.

News that the Marin County website was serving porn came as no great surprise to malware researchers. A number of individuals had contacted the owners of the site in mid-September, alerting them to the problem. Unfortunately these alerts were ignored, as they were believed to be ‘phishing’ attempts.

The primary source of the problem was an apparent DNS hack, which redirected parts of the Marin County website to pornographic sites. These were redirects – the government site itself was not hosting porn. The hack occurred at an outsourced provider, which didn’t have the tightest security practices in place.

As many in the research community know, finding these types of hack is trivial work – it can often be a matter of using simple Google searches such as sex porn site:gov.

Malware researchers are quite familiar with the power of search engines in conducting research. Search engines are free, powerful and efficient tools that can be used to find vulnerabilities and hacked sites on the web, and even in your own organization.

Generally, one sees websites compromised through stolen FTP credentials; unpatched (usually open source) software, including poorly maintained LAMP stacks; the increasing use of collaborative, ‘web 2.0’-type software (wikis, tikis, etc.); DNS hacks; poorly written ASP code; sloppy PHP work and SQL hacks.

So-called ‘Google dorks’ can be useful in finding compromised and malware-hosting websites. The term was coined by Johnny Long on his website, where he described the practice (which had been around for some time) of using Google searches to find ‘dorks’ – people who expose too much information on the web.

‘Google dorking’ has evolved, and malware researchers continue to fine-tune their searches to find vulnerable websites and malware. Furthermore, by using Google’s Alerts feature, one can enter a number of different searches and receive alerts when a matching site is found – useful in finding newly compromised or rogue sites.

Some broader starter queries include inurl:traff (or .info), inurl:in.cgi, inurl:klik and intitle:"index of". The last of these is followed by any of a number of terms, such as "love exe", "jpg exe", porn exe, xxx pif, "bot exe" or "gif exe". A final search might therefore be intitle:"index of" xxx pif -filetype:html -filetype:php -filetype:htm or, as another example, intitle:"index of" "bot exe" -filetype:html -filetype:php -filetype:htm.
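The way these open-directory queries are assembled can be sketched in a few lines of Python. This is only an illustration: the function name build_index_of_query and the TERM_SETS list are hypothetical, the terms and exclusions are the ones quoted above, and the code only builds query strings without contacting any search engine.

```python
# Minimal sketch: compose 'intitle:"index of"' queries from the terms
# mentioned in the article. Names here are illustrative assumptions.

def build_index_of_query(terms, excluded_filetypes=("html", "php", "htm")):
    """Return a query string for open directory listings containing `terms`.

    `terms` is inserted as given, so quoted phrases must already carry
    their quotes. Each excluded file type becomes a -filetype: operator.
    """
    parts = ['intitle:"index of"', terms]
    parts.extend(f"-filetype:{ext}" for ext in excluded_filetypes)
    return " ".join(parts)


# Term sets taken from the text above.
TERM_SETS = ['"love exe"', '"jpg exe"', "porn exe", "xxx pif", '"bot exe"', '"gif exe"']

queries = [build_index_of_query(terms) for terms in TERM_SETS]

if __name__ == "__main__":
    for query in queries:
        print(query)
```

Keeping the term sets in a list makes it easy to add new file-name patterns as they are observed, without retyping the exclusion operators each time.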

Since ‘jump pages’ are often created with the sole purpose of being indexed on search engines and redirecting visitors to other content, running searches on something like might prove useful. Alternatively, searches can be performed for specific directory structures of frameworks used in malware, such as /stata/index.php.

Pornography and malware distributors commonly hack into websites for search engine optimization and increased distribution (ironically, Google's work in marking sites as 'unsafe' in search results is likely driving malware and porn distributors to rely increasingly on hacking 'good sites' to perform redirections to their own bad sites). Finding these hacked sites is similarly trivial. One can simply look for any combination of terms, such as ‘porn, free ringtones, free casino’, followed by some operators to narrow down the search.

Some knowledge of the language used by the distributors also helps – ‘sesso’ and ‘fottilo’, for example, are often used by Italian malware and porn distributors (such as Gromozon). At the time of writing this article, the search sesso OR gratuito porno OR fottilo site:gov produces some rather interesting (and sometimes very dangerous) results.
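As a rough sketch, a query of this shape can be generated from a list of language-specific terms and a target domain. The helper name build_site_query is a hypothetical choice for illustration; the terms and the site:gov restriction come from the search quoted above, and nothing is sent to a search engine.

```python
# Minimal sketch: join alternative terms with OR and restrict the
# search to one domain. The function name is an illustrative assumption.

def build_site_query(term_groups, site):
    """Return '<group> OR <group> ... site:<site>'.

    Each entry in `term_groups` is used as written, so an entry may
    contain more than one word (for example 'gratuito porno').
    """
    if not term_groups:
        raise ValueError("at least one term group is required")
    return f"{' OR '.join(term_groups)} site:{site}"


ITALIAN_TERMS = ["sesso", "gratuito porno", "fottilo"]

if __name__ == "__main__":
    print(build_site_query(ITALIAN_TERMS, "gov"))
```

Swapping the second argument for another top-level domain, or for a single organization's domain, is one way to experiment with different targets.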

One can continue to experiment by adding different domains and additional operators to the searches. It’s common to find plenty of comment spam using these methods, but very often you’ll also find compromised websites.

Organizations large and small can use similar searches to find vulnerabilities on their own sites. This holds especially true for larger organizations that work in collaborative environments, such as academic institutions and some governmental organizations. For example, problems with vulnerabilities commonly exist in colleges and universities, where students are often provided with their own websites, academic discourse is encouraged through open source collaborative software, and servers are managed by different groups throughout a campus. It’s a recipe for disaster, and that’s often exactly what happens – finding hacked university websites is almost trite work. IT administrators could complement their security toolboxes with search engines, seeking inappropriate content on their own domains.
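For an administrator, one possible starting point is to generate a checklist of searches covering every domain the organization runs. The sketch below is an assumption-laden illustration: the domains are placeholders, the function names are hypothetical, and the terms are the examples mentioned earlier in this article. It produces query strings only; running them, by hand or through an alerting feature, is left to the reader.

```python
# Minimal sketch: pair each of an organization's domains with each
# suspicious term to produce one site-restricted query per pair.
from itertools import product

SUSPICIOUS_TERMS = ["porn", "free ringtones", "free casino"]  # from the article
OWN_DOMAINS = ["example.edu", "students.example.edu"]  # placeholder domains


def quote_phrase(term):
    """Wrap multi-word terms in double quotes so they match as a phrase."""
    return f'"{term}"' if " " in term else term


def audit_queries(domains, terms):
    """Return one 'term site:domain' query for every domain/term pair."""
    return [
        f"{quote_phrase(term)} site:{domain}"
        for domain, term in product(domains, terms)
    ]


if __name__ == "__main__":
    for query in audit_queries(OWN_DOMAINS, SUSPICIOUS_TERMS):
        print(query)
```

Quoting multi-word terms is a design choice that trades recall for fewer false positives; dropping the quotes casts a wider net.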

In addition to finding malware on the web, there are numerous (and often hair-raising) searches available that can be used to find vulnerabilities on a site. Queries are limited only by creativity, technical acumen and knowledge of data structures.

Finally, there are distinct differences between search engines. Yahoo, Google and present similar data, but sometimes one provides clearer results than the other. has the powerful and unique feature of allowing IP searches. Searching for common malware IPs produces profitable results, such as ip:, ip:, and so on.

Some researchers are frustrated by the inability to search within the source of web pages – which, if provided, would open up a mother lode of information, obviating the need to use proprietary spiders. For now, however, one can find plenty of information using simple searches, enabling the research community to find bad things before they get out broadly to the public – and in the process, hopefully making some impact on the safety of the Internet experience for users.

Sunbelt researchers Francesco Benedini and Adam Thomas contributed to this article.


