How Spammers Fool Rule-Based and Signature-Based Spam Filters
Author: Paul Judge, CTO, CipherTrust, Inc.
Stopping spam effectively over the long term requires much more than blocking individual IP addresses and creating rules based on keywords that spammers typically use. Increasingly sophisticated spam tools, coupled with a growing number of spammers in the wild, have driven a hyper-evolution in the variety and volume of spam. The old ways of blocking the bad guys just don't work anymore.
Examining spam and spam-blocking technology can illuminate how this evolution is taking place and what can be done to combat spam and reclaim email as the efficient, effective communication tool it was intended to be.
Heuristics (Rule-based Filtering)
One method used to combat spam is rule-based, or heuristic, filtering. Rule-based filters scan email content for predetermined words or phrases that may indicate a message is spam. For example, if an email administrator includes the word "sex" on a company's rule-based list, any email containing this word will be flagged and filtered out.
The major drawback of this approach is the difficulty of identifying keywords that are consistently indicative of spam. While spammers may frequently use the words "sex" and "Viagra" in spam emails, these words are also used in legitimate business correspondence, particularly in the healthcare industry. Additionally, spammers have learned to obfuscate suspect words by using spellings such as "S*E*X" or "VI a a GRR A".
It is impossible to develop dictionaries that identify every possible misspelling of "spammy" keywords. Additionally, because filtering for certain keywords produces large numbers of false positives, many organizations have found they cannot afford to rely solely on rule-based filters to identify spam.
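Both weaknesses described above can be seen in a minimal sketch of a keyword rule. The word list, the function name `is_spam`, and the sample messages are all hypothetical, chosen only for illustration; real heuristic filters typically combine many weighted rules rather than a single word list.

```python
import re

# Hypothetical rule list; a real deployment would hold many more entries.
BLOCKED_WORDS = {"sex", "viagra"}


def is_spam(message: str) -> bool:
    """Flag a message if any whole word appears on the blocked list."""
    # Lowercase the text and split it into runs of letters.
    words = re.findall(r"[a-z]+", message.lower())
    return any(word in BLOCKED_WORDS for word in words)


if __name__ == "__main__":
    samples = [
        "Cheap Viagra here",                          # spam, caught
        "Patient's Viagra prescription was renewed",  # legitimate, still flagged
        "Cheap VI a a GRR A here",                    # obfuscated spam, missed
        "S*E*X",                                      # obfuscated spam, missed
    ]
    for text in samples:
        print(f"{is_spam(text)!s:5}  {text}")
```

The second sample is a false positive, and the last two slip through because the obfuscated spellings never form a word that appears on the list.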
Signature-Based Spam Filters
Another method used to combat spam is signature-based filtering. Signature-based filters examine the contents of known spam, usually derived from honey pots, or dummy email addresses set up specifically to collect spam. Once a honey pot receives a spam message, the content is examined and given a unique identifier. The unique identifier is obtained by assigning a value to each character in the email. Once all characters have been assigned a value, the values are totaled, creating the spam's signature. The signature is added to a signature database and sent as a regular update to the email service's subscribers. Every email coming in to the network is then checked against the signature database, and all matching messages are discarded as spam.
The benefit of signature-based filters is that they rarely produce false positives, that is, legitimate email incorrectly identified as spam. The drawback of signature-based filters is that they are very easy to defeat. Because they are backward-looking, they only deal with spam that has already been sent. By the time the honey pot receives a spam message, the system assigns a signature, and the update is sent and installed on the subscribers' networks, the spammer has already sent millions of emails. A slight modification of the email message will render the existing signature useless.
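The signature process described above can be sketched as follows. This is an illustration under stated assumptions, not any vendor's actual algorithm: it takes each character's value to be its Unicode code point and the signature to be the plain sum of those values. The names `signature`, `matches_known_spam`, and the sample message are hypothetical. A plain sum ignores character order, so production systems would more plausibly use a stronger checksum or hash.

```python
def signature(message: str) -> int:
    """Total the per-character values (assumed here to be code points)."""
    return sum(ord(ch) for ch in message)


def matches_known_spam(message: str, signature_db: set[int]) -> bool:
    """Return True if the message's signature is in the database."""
    return signature(message) in signature_db


if __name__ == "__main__":
    # Hypothetical spam collected by a honey pot.
    honeypot_spam = "Buy cheap pills now"
    signature_db = {signature(honeypot_spam)}

    # An identical copy arriving later matches the stored signature.
    print(matches_known_spam("Buy cheap pills now", signature_db))

    # Adding one character changes the total, so the match fails.
    print(matches_known_spam("Buy cheap pills now!", signature_db))
```

An exact copy of the collected message is caught, but any change to the character total produces a signature the database has never seen.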
Furthermore, spammers can easily evade signature-based filters by using special email software that adds random strings of content to the subject line and body of the email. Because the variable content alters the signature of each email sent by the spammer, signature-based spam filters are unable to match the email to known pieces of spam.
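To see why random content defeats signature matching, consider a short sketch. It assumes a simple additive signature (the sum of each character's code point), which is one reading of the scheme described in this article; the helper `add_random_suffix`, the suffix length, and the sample message are hypothetical and stand in for whatever a spammer's mailing software actually does.

```python
import random
import string


def signature(message: str) -> int:
    """Additive signature: the sum of each character's code point (assumed)."""
    return sum(ord(ch) for ch in message)


def add_random_suffix(message: str, rng: random.Random, length: int = 8) -> str:
    """Append a space and a random lowercase string, as spam software might."""
    suffix = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return f"{message} {suffix}"


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so the demonstration is repeatable
    base = "Buy cheap pills now"
    signature_db = {signature(base)}  # signature captured from a honey pot

    for _ in range(5):
        variant = add_random_suffix(base, rng)
        caught = signature(variant) in signature_db
        print(f"caught={caught}  {variant}")
```

Every appended character adds a positive value to the total, so no variant can reproduce the signature of the original message, even though a human reader would see the same advertisement each time.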
Developers of signature-based spam filters have learned to identify the tell-tale signs of automated random character generation. But as is often the case, spammers remain a step ahead and have developed more sophisticated methods for inserting random content. As a result, most spam continues to fool signature-based filters.
The Solution
When used individually, each anti-spam technique has been systematically overcome by spammers. Grandiose plans to rid the world of spam, such as charging a penny for each email received or forcing servers to solve mathematical problems before delivering email, have been proposed with few results. These schemes are not realistic, because a large percentage of the population would have to adopt the same anti-spam method for them to be effective. You can learn more about the fight against spam by visiting our website at www.ciphertrust.com and downloading our whitepapers.
About the author:
Dr. Paul Judge is a noted scholar and entrepreneur. He is Chief Technology Officer at CipherTrust, the industry's largest provider of enterprise email security. The company's flagship product, IronMail, provides a best-of-breed enterprise anti-spam solution designed to stop spam, phishing attacks, and other email-based threats. Learn more by visiting www.ciphertrust.com/products/spam_and_fraud_protection today.