If you use email, it's likely that you've recently been visited by a piece of spam—an unsolicited, unwanted message, sent to you without your permission. If you manage an email system, it's almost certain that you've had to help your users avoid the deluge of unwanted email.
System administrators pay for spam with their time. The Internet's email system was designed to make it difficult to lose email messages: when a computer can't deliver a message to the intended recipient, it does its best to return that message to the sender. If it can't return the message to the sender, it sends it to the computer's postmaster, because something must be seriously wrong if the email addresses of both the sender and the recipient of a message are invalid.
The well-meaning nature of Internet mail software becomes a positive liability when spammers come into the picture. In a typical bulk mailing, anywhere from a few hundred to tens of thousands of email addresses might be invalid. Under normal circumstances these email messages would bounce back to the sender. But the spammer doesn't want them! To avoid being overwhelmed, spammers often use invalid return addresses. The result: the email messages end up in the mailboxes of the Internet postmasters, who are usually living, breathing system administrators.
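The fallback order described above (recipient, then sender, then postmaster) can be sketched in a few lines of Python. This is only an illustrative model, not how any particular mail transport agent is implemented: the function name `final_mailbox`, the `valid` address set, and the `postmaster@example.com` address are all assumptions made up for the example, and real mail software handles many more cases.

```python
from typing import Set

# Hypothetical postmaster address, used only for this illustration.
POSTMASTER = "postmaster@example.com"


def final_mailbox(sender: str, recipient: str, valid: Set[str],
                  postmaster: str = POSTMASTER) -> str:
    """Return the mailbox that ends up holding the message.

    Simplified model: try the recipient, then bounce to the sender,
    and fall back to the postmaster when both addresses are invalid.
    """
    if recipient in valid:
        return recipient      # normal delivery
    if sender in valid:
        return sender         # bounce back to the sender
    return postmaster         # neither address works


valid = {"alice@example.com", "bob@example.com"}

# Normal delivery.
print(final_mailbox("alice@example.com", "bob@example.com", valid))
# Bad recipient, valid sender: the message bounces to the sender.
print(final_mailbox("alice@example.com", "nobody@example.com", valid))
# Bad recipient and forged return address: the postmaster gets it.
print(final_mailbox("forged@invalid.example", "nobody@example.com", valid))
```

Under this model, every invalid recipient in a bulk mailing with a forged return address adds one message to the postmaster's mailbox, which is the burden on system administrators that the text describes.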
System administrators at large sites are now receiving hundreds to thousands of bounced spam messages each day. Unfortunately, each of these messages has to be carefully examined, because mixed in with these messages are the occasional bounced mail messages from misconfigured computers that actually should be fixed.
As the spam problem grows worse, system administrators are increasingly taking themselves off their computers' "postmaster" mailing lists. The result is predictable: they receive less email, but problems that they would normally discover by receiving postmaster email are being missed as well. The Internet as a whole suffers as a result.
Although there are many important ways to reduce spam—including obscuring email addresses, complaining to spammers' service providers, and seeking legal and legislative relief—few remedies are as immediately effective as filtering email messages on the basis of content and format, and few filtering systems are as widely used and well maintained as SpamAssassin™.
This book is for mail system administrators, network administrators, and Internet service providers who are concerned about the growing toll that spam is taking on their systems and their users and are looking for a way to regain some control or reduce the burden on their users.
Scope of This Book
This book is divided into nine chapters and one appendix. The first four chapters deal with core SpamAssassin concepts that are independent of the underlying mail system.
- Chapter 1
- Explains what SpamAssassin does, and provides a conceptual overview of its organization and features.
- Chapter 2
- Covers the installation, testing, and basic operation of SpamAssassin.
- Chapter 3
- Details the configuration of SpamAssassin, and focuses particularly on SpamAssassin's spam-detection rules. It explains how to increase or decrease the impact of rules, write new rules, and add addresses to blacklists and whitelists.
- Chapter 4
- Reviews the learning features of SpamAssassin: automatic whitelisting and Bayesian filtering. It provides the theory behind these features and discusses how to configure, train, and tune them.
The remaining five chapters detail the integration of SpamAssassin with several popular mail transport agents (MTAs) to provide sitewide spam-checking. They also explain how to set up a SpamAssassin gateway to check all incoming mail before delivery to an internal mail host.
- Chapter 5
- Explains how to integrate SpamAssassin with the sendmail MTA, using the milter interface. As an example of this approach, the installation and configuration of MIMEDefang is described.
- Chapter 6
- Explains how to integrate SpamAssassin with the Postfix MTA, using the content_filter interface. As an example of this approach, the installation and configuration of amavisd-new, a daemonized content filter, is described.
- Chapter 7
- Explains how to integrate SpamAssassin with the qmail MTA.
- Chapter 8
- Explains how to integrate SpamAssassin with the Exim MTA using several different popular approaches including custom transports, exiscan, and sa-exim.
- Chapter 9
- Explains how to set up a SpamAssassin POP mail proxy to support users who download their email with POP clients.
The Appendix lists useful resources for more information about SpamAssassin and other antispam approaches.
Versions Covered in This Book
At the time this book went to press, SpamAssassin 2.63 was the latest released version of SpamAssassin and was in wide use. The next-generation release of SpamAssassin, SpamAssassin 3.0, was available for beta-testing and is expected to be released at about the time this book appears in stores. SpamAssassin 3.0 introduces several important new features and changes parts of the Perl API.
Accordingly, this book covers both versions of SpamAssassin. When a topic or setting is specific to one version, I so note it.
Conventions Used in This Book
The following conventions are used in this book:
- Italic
- Used for Unix file, directory, user, and group names and for Perl modules, objects, method names, and method options. It is also used for example URLs (uniform resource locators) and to emphasize new terms and concepts when they are introduced.
- Constant Width
- Used for Unix commands, code examples, and system output. It is also used for scripts, process names, and SpamAssassin directives.
- Constant Width Italic
- Used in examples for variable input (e.g., a filename you must provide).
- $
- The Unix Bourne shell or Korn shell prompt.
- #
- The Unix superuser prompt. I use this symbol for examples that should be executed by root.
This icon designates a note, which is an important aside to the nearby text.
This icon designates a warning related to the nearby text.
Using Code Examples
All the code in this book is available for download from http://www.oreilly.com/catalog/spamassassin. See the file readme.txt in the download for installation instructions.
This book is here to help you get your job done. In general, you may use the code in this book in your programs and documentation. You do not need to contact us for permission unless you're reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O'Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product's documentation does require permission.
We appreciate, but do not require, attribution. An attribution usually includes the title, author, and publisher; for example: "SpamAssassin, by Alan Schwartz (O'Reilly)."
If you feel your use of code examples falls outside fair use or the permission given previously, feel free to contact us at email@example.com.
Comments and Questions
We have tested and verified the information in this book to the best of our ability, but you may find that features have changed (or even that we have made mistakes!). Please let us know about any errors you find, as well as your suggestions for future editions, by writing to:
O'Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
(800) 998-9938 (U.S. and Canada)
(707) 827-7000 (international/local)
(707) 829-0104 (fax)
You can also contact O'Reilly by email. To be put on the mailing list or request a catalog, send a message to:
We have a web page for this book, which lists errata, examples, and additional information. You can access this page at:
To comment or ask technical questions about this book, send email to:
For more information about O'Reilly books, conferences, Resource Centers, and the O'Reilly Network, see the O'Reilly web site at:
Acknowledgments
Bob Amen, Justin Mason, and Matt Riffle served as technical reviewers for this book. Any remaining errors, of course, are mine.
I have once again had the pleasure of collaborating with an excellent O'Reilly editor, Jonathan Gennick. The O'Reilly production crew for this book included Darren Kelly, Ellie Volckhausen, and Nancy Crumpton.
This book is dedicated to the developers and user community of SpamAssassin, for their fine work in helping to stem the flood of unwanted email.
Never-ending thanks to M.G. and Ari, who make it all worthwhile.
- Spam is also a registered trademark of Hormel Foods, which uses the word to describe a canned luncheon meat. In this book, the word "spam" is used exclusively to refer to Internet spam and not the meat.