We know it before you do: predicting malicious domains

Wednesday 24 September 11:30 - 12:00, Red room.

Wei Xu Palo Alto Networks
Yanxin Zhang Palo Alto Networks
Kyle Sanders Palo Alto Networks

This paper is available online (HTML, PDF).

Malicious domains have been used in various attacks, from distributing malware to hosting C&C servers and redirecting traffic. Most modern domain reputation systems are designed to detect malicious domains based on evidence (i.e. existing malicious content). One problem is that many malicious domains are used only for a very short period of time in order to evade blocking. In other words, many malicious domains have already served most of their purpose by the time the malicious content is detected and the domains are blocked.

As a first step towards solving this problem, we propose a system to predict the domains that are most likely to be used maliciously. Our approach is based on novel research into the connections and patterns exhibited among various detected malicious domains. In summary, we made the following discoveries:

1) Connections between malicious domains that have been in use at different times: We collected and analysed the connections between the malicious domains that were used in different spam/malware campaigns. We identified several types of connections exhibited among these domains based on our PDNS data, whois data, sinkhole data and malware detection data. We used these connections to derive other domains that had not yet been detected as malicious, and tracked the inferred domains.

2) Re-use of previously malicious domains: We discovered multiple cases in which previously detected malicious domains were re-used by attackers. We studied the characteristics of these re-used domains and analysed the rationale behind the re-use. Based on our analysis, we derived several patterns and applied them to find more domains that are likely to be re-used.

3) Temporal patterns of DNS queries of malicious domains: We discovered several patterns in the DNS queries of a domain before the domain was detected as malicious. For different types of malicious domains (i.e. those registered by attackers and those that have had malicious content inserted), the patterns indicate different activities related to preparing the domains for malicious purposes. We were able to use these patterns to identify likely malicious domains in our PDNS data feed.
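
As an illustration of the first discovery, the idea of deriving undetected domains from their connections to known malicious ones can be sketched as a simple pivot over shared attributes. This is a minimal sketch and not the authors' system: the function name `derive_candidates`, the record layouts, and the choice of shared IP address and shared registrant as the only connection types are all assumptions made for the example, and the domains and addresses are reserved placeholder values.

```python
from collections import defaultdict


def derive_candidates(known_malicious, pdns_records, whois_records):
    """Map each undetected domain to the attributes it shares with a known malicious domain.

    Hypothetical inputs (assumed for this sketch):
      known_malicious: set of domain names already detected as malicious
      pdns_records:    iterable of (domain, ip) pairs from passive DNS
      whois_records:   iterable of (domain, registrant) pairs from whois
    """
    # Index every domain under each attribute it has been observed with.
    by_attr = defaultdict(set)
    for domain, ip in pdns_records:
        by_attr[("ip", ip)].add(domain)
    for domain, registrant in whois_records:
        by_attr[("registrant", registrant)].add(domain)

    # An attribute is suspicious if at least one known malicious domain uses it;
    # every other domain under that attribute becomes a candidate.
    candidates = defaultdict(set)
    for attr, domains in by_attr.items():
        if domains & known_malicious:
            for domain in domains - known_malicious:
                candidates[domain].add(attr)
    return dict(candidates)


# Usage with placeholder data (reserved example domains and documentation IP ranges).
known = {"bad1.example", "bad2.example"}
pdns = [
    ("bad1.example", "192.0.2.10"),
    ("new1.example", "192.0.2.10"),
    ("benign.example", "198.51.100.7"),
]
whois = [
    ("bad2.example", "reg@example.org"),
    ("new2.example", "reg@example.org"),
    ("new1.example", "other@example.org"),
]
found = derive_candidates(known, pdns, whois)
for domain in sorted(found):
    print(domain, sorted(found[domain]))
```

In practice a single shared attribute is a weak signal, since unrelated sites often share hosting, so a real system would likely need to weight or combine several connection types before treating a candidate as predicted malicious.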

To the best of our knowledge, most of these patterns and discoveries are presented for the first time. To evaluate the effectiveness of this work, we released DNS signatures on the predicted domains and we also tracked the detection of predicted domains on VirusTotal. The results suggest that over 83% of the predicted domains were detected on VirusTotal and were blocked by our firewall.

Wei Xu

Wei Xu is a security researcher at Palo Alto Networks. His current research interests include web security, network security and security data analysis. His past research has been published in both academic journals and industry venues. He was a speaker at VB2012 and at Black Hat 2013. He received his B.S. and M.S. degrees in electrical engineering from Tsinghua University, Beijing in 2005 and 2007, respectively. He obtained his Ph.D. in computer science from Penn State University in 2013.

Yanxin Zhang

Yanxin Zhang received his Ph.D. in electrical engineering from the Pennsylvania State University in 2009. He joined Palo Alto Networks in 2011. He is the co-creator of the well-known cloud security platform WildFire. He holds three pending patents related to malware analysis and detection. He is in charge of the WildFire and anti-virus system infrastructure.

Kyle Sanders

Kyle Sanders has worked in the IT industry for the last 10 years and is currently the team lead for malware research at Palo Alto Networks. His research interests are in automated malware detection, network forensics and code analysis.