The problem of backscatter – part 3

2008-11-01

Terry Zink

Microsoft, USA
Editor: Helen Martin

Abstract

Terry Zink concludes his series of articles on backscatter with a look at another technique used to combat this type of spam: Bounce Address Tag Validation.


In the first part of this series on backscatter (see VB, September 2008, p.S2), we looked at what backscatter spam is and why it is such a problem. Last month, we looked at some rudimentary techniques for stopping backscatter spam, including content analysis (see VB, October 2008, p.S1). We also looked at some methods we could use to stop ourselves from contributing to the problem. This month, we look at another technique used to combat this type of spam: Bounce Address Tag Validation, or BATV.

Bounce Address Tag Validation

Last month I mentioned that, when a bounce message is received, anti-spam systems could try to work out whether or not you sent the original message. BATV offers a much more secure mechanism for determining this. I won’t go into the full technical details, but I will hit on the highlights.

Imagine if you could take a look at a message and determine whether or not you sent it. You can do that to a certain extent by parsing through the Received headers and checking whether they conform to your outbound email standards. For example, do they come from your email servers? Do they have certain idiosyncrasies like special headers? However, you don’t have to do it that way. Table 1 shows the structure of an email.

Who                              Specified in
Originator (author)              Content - From/Resent-From
Submitter into transfer service  Content - Sender/Resent-Sender
Return address (bounces)         Envelope - Mail-From; Content - Return-Path
Sending relay                    Envelope - HELO/EHLO; Content - Received header
Receiving relay                  Content - Received header

Table 1. Structure of an email [1].

Rather than putting just the sender in the MAIL FROM field, BATV specifies that a signature (i.e. a cryptographic tag generated with a secret key) should be added to it. The outgoing mail agent adds the signature to the bounce address:

Regular                      BATV
MAIL FROM mailbox@domain     MAIL FROM sig-scheme=mailbox/sig-data@domain
MAIL FROM [email protected]   MAIL FROM prvs=me/[email protected]
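As a rough illustration of the tagging step, the Python sketch below rewrites a bounce address into the tag-first form (prvs=tag=mailbox@domain) that appears in the recipient example later in this article. The tag layout (one key digit, a three-digit expiry day and six hex characters of an HMAC-SHA1), the seven-day lifetime, the secret key and the function name are all assumptions made for the example, loosely modelled on the Internet draft; it is not a reference implementation.

```python
import hashlib
import hmac

# Hypothetical key table (key number -> secret) and validity window.
SECRET_KEYS = {0: b"example-secret-key"}
TAG_LIFETIME_DAYS = 7


def sign_bounce_address(address: str, today: int, key_num: int = 0) -> str:
    """Rewrite mailbox@domain as prvs=<tag>=mailbox@domain.

    `today` is a day number (days since the Unix epoch); it is passed in
    explicitly so that the example is deterministic.
    """
    mailbox, domain = address.rsplit("@", 1)
    # Three-digit day on which the tag stops being valid.
    expiry = f"{(today + TAG_LIFETIME_DAYS) % 1000:03d}"
    # The keyed hash covers the key number, the expiry day and the address.
    material = f"{key_num}{expiry}{mailbox}@{domain}".encode("utf-8")
    digest = hmac.new(SECRET_KEYS[key_num], material, hashlib.sha1).hexdigest()
    tag = f"{key_num}{expiry}{digest[:6]}"
    return f"prvs={tag}={mailbox}@{domain}"


if __name__ == "__main__":
    # 14184 is just an arbitrary day number for the demonstration.
    print("MAIL FROM:<%s>" % sign_bounce_address("me@example.com", today=14184))
```

Because the tag depends only on the key, the expiry day and the address, the same message signed twice on the same day gets the same MAIL FROM value.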

The advantage here is that the mail server receiving the NDRs and backscatter does not need to rely on the original recipient mail server to perform any verification of the sender. It can all be done at its own end:

  1. The server knows that all of its outgoing mail is signed in the MAIL FROM field.

  2. It receives an inbound message and it appears to be an NDR.

  3. When the RCPT TO information is extracted, it should contain the key value pair. If this is decrypted and validated, the message can be accepted because the original message was sent from this mail server. There is not even any need to filter it further. If the key value pair does not check out, the message can be discarded because it is spoofed backscatter.

The basic idea behind BATV is that it allows you to verify whether or not NDR bounces originally came from you.
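The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the draft's algorithm: the tag layout (key digit, three-digit expiry day, six hex characters of an HMAC-SHA1), the seven-day window, the secret key and all function names are hypothetical. Note that the sketch validates the tag by recomputing a keyed hash and comparing, rather than by decrypting anything.

```python
import hashlib
import hmac
import re

SECRET_KEY = b"example-secret-key"  # hypothetical; must match the signing side
TAG_LIFETIME_DAYS = 7               # assumed validity window

# prvs=<key digit><3-digit expiry day><6 hex chars>=<mailbox>@<domain>
PRVS_RE = re.compile(r"^prvs=(\d)(\d{3})([0-9a-f]{6})=([^@]+)@(.+)$", re.IGNORECASE)


def expected_sig(key_num: str, expiry: str, mailbox: str, domain: str) -> str:
    """Recompute the six hex characters the signing side would have produced."""
    material = f"{key_num}{expiry}{mailbox}@{domain}".encode("utf-8")
    return hmac.new(SECRET_KEY, material, hashlib.sha1).hexdigest()[:6]


def classify_bounce(mail_from: str, rcpt_to: str, today: int) -> str:
    """Return 'not-a-bounce', 'accept' or 'reject' for an inbound envelope."""
    if mail_from != "":
        return "not-a-bounce"  # bounces are sent with a null sender
    match = PRVS_RE.match(rcpt_to)
    if match is None:
        return "reject"        # unsigned, but all of our outbound mail is signed
    key_num, expiry, sig, mailbox, domain = match.groups()
    good = expected_sig(key_num, expiry, mailbox, domain)
    if not hmac.compare_digest(sig.lower(), good):
        return "reject"        # tag was not produced with our key
    days_left = (int(expiry) - today) % 1000
    if days_left > TAG_LIFETIME_DAYS:
        return "reject"        # tag has expired
    return "accept"


if __name__ == "__main__":
    day = 14184  # arbitrary day number
    expiry = f"{(day + TAG_LIFETIME_DAYS) % 1000:03d}"
    signed = f"prvs=0{expiry}{expected_sig('0', expiry, 'me', 'example.com')}=me@example.com"
    print(classify_bounce("", signed, day))            # genuine bounce
    print(classify_bounce("", "me@example.com", day))  # unsigned backscatter
```

The decision needs nothing from the remote server beyond the envelope it sends back, which is the point made above: all of the checking happens at the receiving end.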

BATV in a nutshell

Figure 1 summarizes how BATV is designed to work to prevent backscatter.

Figure 1. Summary of how BATV is designed to prevent backscatter.

Note the sequence of steps:

  1. I send a message to my co-worker Ritesh and hand it off through our outbound server. However, unbeknownst to me, Ritesh has recently changed his email address.

  2. My outbound server signs my SMTP MAIL FROM by adding a cryptographic tag.

  3. The recipient email server, mail.i_hate_spam.com, sees that the person I am delivering to, Ritesh, does not exist.

  4. The mail server accepts the message, but then bounces it back with a null sender and puts the original, signed, MAIL FROM information into the RCPT TO field.

  5. When the message reaches my inbound mail server, it sees that nospam.com is an outbound customer. Indeed, it is my domain. My mail server determines that the message is a bounce. It decrypts the RCPT TO information which is subsequently verified, so it accepts the message and it is delivered straight to my inbox.

  6. Meanwhile, evil spammer Mark Q. Spammer sends a message to Ritesh at mail.i_hate_spam.net while forging my address.

  7. Mail.i_hate_spam.net accepts the message, discovers that it can’t deliver it (because Ritesh doesn’t exist there either) and then bounces it back to me since I appear to be the one who sent the message.

  8. When the bounced message hits my inbound email server, the server sees that I am an outbound customer and that the message is an NDR. However, because the RCPT TO field is not signed, and my server knows that all genuine outbound mail from customers is signed, the message is rejected.

That’s BATV in a nutshell.

BATV and Sender Policy Framework (SPF)

BATV is one of the better mechanisms available to stop backscatter. The question now is how do we use it? What potential problems are associated with BATV?

One problem is that unless you have an SPF policy that dictates a hard fail on your outbound mail, BATV doesn’t necessarily work. The reason is that if you don’t know where your outgoing mail is coming from, you can’t necessarily say it didn’t come from you if it isn’t signed in a bounce.

For example, if your SPF record is this:

v=spf1 ip4:10.10.10.0/24 -all

then you know that all your outgoing mail comes only from those IP addresses. Everyone you send mail to also knows that mail from you comes only from those IP addresses and therefore your receivers should hard fail (reject) any mail that claims to come from you but which is outside of those IPs. Since you know which IPs you send mail from, you know that you always sign mail from those IPs as well. Thus, a bounce message that isn’t signed means that it didn’t come from those IPs; you can ‘hard fail’ the bounce message. It’s a little like a secondary SPF check.

However, suppose your SPF record is one of the following:

v=spf1 ip4:10.10.10.0/24 ~all

or

v=spf1 ip4:10.10.10.0/24 ?all

In the former case, a soft fail ‘~all’ means that if mail appears to come from you, but is outside your IP range, then it probably didn’t come from you. It should be accepted, but marked as suspicious. In the latter case, a neutral result ‘?all’ means that if the receiver gets mail from those IPs then it definitely came from you, but if they receive mail from outside those IPs then it may or may not have come from you – i.e. you are not entirely sure which IPs you use to send outbound mail. Thus, you neither confirm nor deny anything about mail claiming to come from you that is outside those IP ranges.

And therein lies the problem for these two cases. If you can’t say for sure which IP range your mail comes from, then you can’t be sure that all of your outbound mail is signed. If you can’t say that all of your outbound mail is signed, then you can’t reject unsigned bounces using BATV. An unsigned message doesn’t necessarily mean that the message didn’t come from you – it says so right there in your SPF policy. You’d have to parse the content of the bounce in order to figure out where the original message came from. If you could do that, then you could implement some conditional logic, because you know that messages from a certain set of IPs are signed on the outbound. This is starting to get a little convoluted, however, and it is prone to failure because you have to rely on recipient MTAs to send back all of the necessary Received headers – and if you could figure out where a message came from by parsing and trusting the original Received headers, you wouldn’t need BATV.

Alternatively, you could specify that if a bounce is signed and passes a BATV check, the message should be accepted without further filtering. Conversely, if it isn’t signed and you know that it is a bounce, it should be filtered more aggressively (i.e. never override a spam verdict for it). The problem is that this takes us back to the issue of false positives, although you’ll probably have fewer false positives anyway because the bounces you want are probably for mail sent from your known good IP range.

I’m not really a big fan of filtering messages more aggressively; I’m just saying that you could do it this way if you had soft fail or neutral SPF policies. The Holy Grail of filtering is to accept the messages you trust and take a harder line on those that you do not. However, in my experience, being more aggressive on untrusted messages just means you add more spam points to the ones you would have caught as spam anyway, and the messages you aren’t sure about just end up as false positives.
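The interaction between the SPF policy and the handling of bounces described in this section can be summarized in a short Python sketch. The function names and the three tag states (‘valid’, ‘invalid’, ‘missing’) are hypothetical, and rejecting a bounce whose tag is present but invalid is an assumption of the example rather than something stated above.

```python
from typing import Optional


def spf_all_qualifier(record: str) -> Optional[str]:
    """Return the qualifier on the `all` mechanism ('+', '-', '~' or '?'),
    or None if the record is not SPF version 1 or has no `all` mechanism."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return None
    for term in terms[1:]:
        lowered = term.lower()
        if lowered == "all":
            return "+"  # a mechanism with no qualifier defaults to '+'
        if len(lowered) == 4 and lowered[0] in "+-~?" and lowered[1:] == "all":
            return lowered[0]
    return None


def bounce_action(spf_record: str, tag_status: str) -> str:
    """Decide what to do with an inbound bounce.

    tag_status is 'valid', 'invalid' or 'missing' (hypothetical labels).
    """
    if tag_status == "valid":
        return "accept"               # signed and verified: no further filtering
    if tag_status == "invalid":
        return "reject"               # assumed: a bad tag is treated as a forgery
    if spf_all_qualifier(spf_record) == "-":
        return "reject"               # hard fail: every genuine message was signed
    return "filter-aggressively"      # soft fail or neutral: cannot be sure


if __name__ == "__main__":
    for policy in ("-all", "~all", "?all"):
        record = "v=spf1 ip4:10.10.10.0/24 " + policy
        print(policy, bounce_action(record, "missing"))
```

Only the hard fail policy lets an unsigned bounce be rejected outright; the other two fall back to the more aggressive filtering discussed above, with the false positive risk that entails.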

Limitations of BATV

While BATV is a good technique, we’ve seen that it does have some limitations when combining it with an SPF policy. What else do we have to consider with BATV?

  • Catch-all addresses or non-deliverable addresses. Some MTAs will look up the recipient in the SMTP conversation. For example, in a hosted service, some companies will upload their valid email addresses and upon receiving an inbound message, the hosted service checks to see if the user to whom the sender is delivering exists. This allows the customer not to have to deal with a bounce; instead it’s done upstream. Similarly, catch-all addresses will deliver non-existent mail to the catch-all instead of bouncing it.

    Because BATV changes the recipient email address on all bounces, you need to make sure that your MTA parses the BATV-signed recipient address properly. Otherwise, your MTA will receive the incoming message, check the recipient against a list of valid email addresses and say ‘No, it doesn’t exist because it’s got this prvs=012345AbCd= in front of it, and none of my valid addresses contain that.’ So, you need to make sure that you upgrade your inbound MTA to make sure it strips the leading BATV tag before performing a lookup.
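A minimal Python sketch of that lookup fix is shown below. The regular expression assumes the tag takes the prvs=...= prefix form used in the example above, and the address list and function names are made up for illustration.

```python
import re

# Assumed prefix form: "prvs=", an alphanumeric tag, then "=".
PRVS_PREFIX = re.compile(r"^prvs=[0-9A-Za-z]+=")

# Hypothetical list of valid addresses uploaded by the customer.
VALID_RECIPIENTS = {"me@example.com", "ritesh@example.com"}


def strip_batv_tag(address: str) -> str:
    """Remove a leading BATV tag, leaving untagged addresses unchanged."""
    return PRVS_PREFIX.sub("", address, count=1)


def recipient_exists(rcpt_to: str) -> bool:
    """Look up the recipient after removing any BATV tag."""
    return strip_batv_tag(rcpt_to).lower() in VALID_RECIPIENTS


if __name__ == "__main__":
    print(recipient_exists("prvs=012345AbCd=me@example.com"))
    print(recipient_exists("prvs=012345AbCd=nobody@example.com"))
```

Stripping the tag only answers the question of whether the mailbox exists. Checking that the tag itself is genuine is a separate step.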

The next few points are paraphrased from the Internet draft [2].

  • Mailing lists. BATV will cause problems with some mailing lists that identify posters by their bounce address. The list will not recognize the identical MAIL FROM addresses, because it will interpret the differing BATV attributes as part of the address. These services will either reject postings or pass them all to the moderator.

  • Greylisters. Greylisting is sending a 4xx-level notification to a sender which means ‘Hey go away, come back later’ and is based on the theory that a spammer won’t return, but a legitimate sender will. A correct BATV implementation will only result in routine delays in this case. However, the result of BATV tagging MUST be a constant local-part, for a given message, and not (say) be created at delivery time such that each retry gets a different validation string, which would prevent it from ever getting through to a greylisting site.

  • Whitelisting/safe senders. If you send outbound mail and suddenly start signing it, people who have whitelisted your MAIL FROMs will suddenly stop recognizing your mail because the MAIL FROM will be different every day. The solution to this, of course, is to update your MTA software such that it supports BATV and is capable of stripping the BATV component of the MAIL FROM before performing sender lookups.

  • Challenge/response systems. Challenge/response (C/R) systems are systems where if you send an email to someone, they bounce it back to you requesting that you click a link to verify that you are a real human and not a spammer. Only once you have done that will the message be delivered to the recipient. The problem BATV poses here is that each signed message can have a different MAIL FROM so, whenever you change the keys, the C/R-protected email inbox will issue you a new challenge. This becomes very annoying for the sender.
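The greylisting point above can be illustrated with a toy model in Python. The greylister here simply defers the first attempt from each (IP, sender, recipient) triplet and ignores the waiting period a real one would enforce. The class, the function names, the addresses and the tag values are all invented for the example.

```python
from typing import Callable, Optional


class ToyGreylister:
    """Defer the first attempt from each (client IP, sender, recipient) triplet."""

    def __init__(self) -> None:
        self.seen = set()

    def check(self, client_ip: str, mail_from: str, rcpt_to: str) -> int:
        triplet = (client_ip, mail_from.lower(), rcpt_to.lower())
        if triplet in self.seen:
            return 250  # seen before: accept
        self.seen.add(triplet)
        return 451      # first sighting: 'come back later'


def attempts_needed(sender_for_attempt: Callable[[int], str],
                    max_attempts: int = 5) -> Optional[int]:
    """Return the attempt number that gets through, or None if none does."""
    greylister = ToyGreylister()
    for attempt in range(1, max_attempts + 1):
        sender = sender_for_attempt(attempt)
        if greylister.check("192.0.2.1", sender, "ritesh@example.net") == 250:
            return attempt
    return None


def tag_fixed_at_queue_time(attempt: int) -> str:
    # The tag is computed once for the message and reused on every retry.
    return "prvs=0191abcdef=me@example.com"


def tag_regenerated_per_attempt(attempt: int) -> str:
    # The tag changes on every delivery attempt (what the draft warns against).
    return f"prvs=0191{attempt:06x}=me@example.com"


if __name__ == "__main__":
    print(attempts_needed(tag_fixed_at_queue_time))
    print(attempts_needed(tag_regenerated_per_attempt))
```

With a constant tag the retry matches the triplet recorded on the first attempt and is accepted. With a tag that changes per attempt every retry looks like a new sender, so the message never gets through.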

To summarize, it is advisable to make a list of all the possible things that can go wrong before implementing BATV. Unintended consequences can cause a major headache if customers start to complain and you have to roll out a feature again.

Wrapping it up

Backscatter spam is annoying. It’s tough to filter because its contents can fool content filters, and it can fool end-users too.

Indeed, if your content filter could recognize an NDR and ignore the parts that typically occur in NDRs, you could filter the rest of the message normally and make the spam/not-spam classification that way.

When it comes to NDRs and Delivery Status Notifications, the key thing to remember is to treat them as a subclass of actual email. An NDR is not marketing, it’s not business mail and it’s not a personal communication; it’s simply a notification that mail that you sent did not get delivered in the way you expected.

We’ve seen a number of ways to filter bounce messages, some better than others. Ultimately, what it comes down to is treating bounce messages differently from regular inbound mail and making decisions based upon that special categorization of email. The rules of normal inbound filtering are modified because that’s a better way to evaluate them.
