SpamAssassin/Integrating SpamAssassin with Exim

Revision as of 10:53, 7 March 2008 by Docbook2Wiki (Talk)

Exim is an MTA developed by Philip Hazel at the University of Cambridge. Exim is designed for Internet mail hosts and provides flexibility, performance, and strong access controls. It has become a popular replacement for sendmail because it provides a compatible command-line interface.

This chapter explains how to integrate SpamAssassin into an Exim-based mail server to perform spam-checking for local recipients or to create a spam-checking mail gateway.

Warning

Exim is a complex piece of software and, more than most MTAs, has an extensive and complicated set of configuration options. This chapter assumes that you are running Exim 4 and does not cover how to securely install, configure, or operate Exim itself. For that information, see the Exim documentation, the web site http://www.exim.org, and the book The Exim SMTP Mail Server: Official Guide for Release 4 by Philip Hazel (UIT Cambridge).

Exim consists primarily of a single setuid executable, exim, that performs different functions depending on its command-line arguments. These functions include listening on the SMTP port and receiving and enqueuing incoming messages, adding locally generated messages to the queue, and processing the queue to transmit outgoing messages. When compiled from source code, exim is installed in /usr/exim/bin, and the examples in this chapter assume that directory is used.

Exim's configuration file defaults to /usr/exim/configure. The configuration file determines the behavior of Exim and defines three important logical entities: access control lists (ACLs), routers, and transports. ACLs define tests that can be performed during incoming SMTP sessions to determine whether Exim will accept a message. Routers determine how messages to a given address should be delivered (or rewritten to new addresses) and queue them up for transports. Transports define delivery mechanisms—methods by which a message can be copied from Exim's queue to a local mailbox, a remote host, or elsewhere. Each of these entities has its own section in the configuration file.

Tip

While you can define ACLs and transports in any order, you must define routers in the order in which they are to run. In the default configuration, the router order is dnslookup (look up remote hostnames and route messages via SMTP), system_aliases (redirect messages on the basis of the /etc/aliases file), userforward (redirect messages on the basis of user .forward files), and localuser (route messages via the local delivery agent).

Spam-Checking via procmail

One easy way to add SpamAssassin to an Exim system is to configure Exim to use procmail as its local delivery agent. Then add a procmail recipe for spam-tagging to /etc/procmailrc.

The advantages of this approach are:

  • It's very easy to set up.
  • You can run spamd, and the procmail recipe can use spamc for faster spam-checking.
  • User preference files, autowhitelists, and Bayesian databases can be used.

However, Exim runs a local delivery agent only for email destined for a local recipient. You cannot create a spam-checking gateway with this approach.

To configure Exim to use procmail for local delivery, add the following transport to the Exim configuration file (in the transports section):

  procmail_pipe:
    driver = pipe
    command = /usr/local/bin/procmail -d $local_part
    return_path_add
    delivery_date_add
    envelope_to_add
    check_string = "From "
    escape_string = ">From "
    user = $local_part
    group = mail
         

Ensure that you provide the proper path to procmail and an appropriate group for running procmail in the definition of the procmail_pipe transport.

Next add a new router to direct messages to the procmail_pipe transport. This router should be added to the routers section of the configuration file before (or in place of) the localuser router, which is usually the last router.

  procmail:
    driver = accept
    check_local_user
    transport = procmail_pipe
    cannot_route_message = Unknown user
    no_verify
    no_expn

Local addresses that reach the procmail router will be accepted and delivered via the procmail_pipe transport, which invokes procmail in its role as a local delivery agent.

Warning

After any change to the Exim configuration file, you must send a SIGHUP signal to the Exim daemon to cause it to reread the configuration file. You can test configuration changes before you do this by running exim -bV.

Next, configure procmail to invoke SpamAssassin. If you want to invoke SpamAssassin on behalf of every user, do so by editing the /etc/procmailrc file. Example 8-1 shows an /etc/procmailrc that invokes SpamAssassin.

Example 8-1. A complete /etc/procmailrc

DROPPRIVS=yes
PATH=/bin:/usr/bin:/usr/local/bin
SHELL=/bin/sh

# Spamassassin
:0fw
* <300000
|/usr/bin/spamassassin

If you run spamd, replace the call to spamassassin in procmailrc with a call to spamc instead. Using spamc/spamd significantly improves performance on most systems but makes it more difficult to enable users to write their own rules.

Spam-Checking All Incoming Mail

If you want to set up a spam-checking gateway for all recipients, local or not, you need a way to perform spam-checking as mail is received, before final delivery. Exim provides three different ways to do this: via routers, via exiscan, and by defining a local_scan( ) function.

In a router-based configuration, SpamAssassin is invoked after Exim has received a message during the process of routing each delivery address in the message. If the message is destined for a local user, SpamAssassin can use per-user preference files; if the message will be relayed to a remote user, SpamAssassin still checks the message using sitewide settings. In this configuration, SpamAssassin may be invoked several times for each message received (once for each message recipient). Figure 8-1 illustrates this configuration.

Figure 8-1. A router-based configuration for spam-checking in Exim

In an exiscan configuration, Exim invokes SpamAssassin during the SMTP transaction by means of a new ACL. Messages that SpamAssassin considers spam can be rejected before the SMTP transaction is complete, or accepted and tagged. However, you cannot use per-user preferences in this configuration without negatively impacting performance. Figure 8-2 illustrates this approach.

Figure 8-2. An exiscan-based configuration for spam-checking in Exim

In a configuration using local_scan( ), Exim invokes SpamAssassin during the SMTP transaction when it calls the local_scan( ) function for the incoming message. The message can be accepted or rejected in the SMTP transaction; if local_scan( ) accepts the message, tagging headers can be added. Other interesting effects, including teergrubing—responding very slowly during the SMTP transaction when spam is detected in order to tie up the spammer's MTA—are possible with this approach, but it is difficult to use per-user preferences in this configuration. Figure 8-3 illustrates this approach.

Figure 8-3. A local_scan( )-based configuration for spam-checking in Exim

Each of these methods is described in detail in the following sections.

Using Routers and Transports

You can configure Exim to pass all incoming mail through SpamAssassin by writing a transport that pipes messages to SpamAssassin and then reinjects them into Exim, and a router that directs messages to the transport. To prevent the reinjected messages from being spam-checked again, you can set their $received_protocol to indicate they've been checked when you reinject them, and use the $received_protocol value as a condition to determine whether or not the router will send them for checking.

Configuring the Transport

Example 8-2 shows the configuration of the transport in /usr/exim/configure.

Example 8-2. A transport for spam-checking

spamassassin:
  driver = pipe
  use_bsmtp = true
  command = /usr/exim/bin/exim -bS -oMr sa-checked
  transport_filter = /usr/bin/spamc -f
  home_directory = "/tmp"
  current_directory = "/tmp"
  user = exim
  group = exim
  log_output = true
  return_fail_output = true
  return_path_add = false

The spamassassin transport in Example 8-2 uses Exim's pipe driver to deliver a message to a command. The example specifies that Exim should use the batched SMTP (BSMTP) format to transmit the message. The command is another invocation of exim itself, with the -bS option to accept BSMTP input and the -oMr sa-checked option to set the $received_protocol variable to sa-checked. Before Exim pipes the message to the command, it filters the message through the program specified by transport_filter—in this case, spamc—and uses the output of the filter as the message to deliver. The other transport options provide home and working directories for running the command, specify that the command should be run as user and group exim, cause command output to be logged and any failure messages to be included in a bounce message, and indicate that a Return-Path header should not be added (because this transport is not performing final delivery).

You must specify that the exim command used in the transport will be run as one of Exim's trusted users in order for the -oMr sa-checked option to work. The Exim user (specified during Exim's installation) is always trusted. You can add other trusted users in the configuration file with the trusted_users or trusted_groups directives.

Configuring the Router

The transport provides a mechanism for Exim to filter messages through SpamAssassin and reinject them. You must also define a router that will invoke this transport during delivery. Example 8-3 displays a definition for such a router in /usr/exim/configure.

Example 8-3. A spam-checking router in Exim

spamassassin_router:
  driver = accept
  transport = spamassassin
  condition = "${if !eq {$received_protocol}{sa-checked} {1}{0}}"
  no_verify
  no_expn

The spamassassin_router in Example 8-3 uses the accept driver, which simply delivers a message to a transport. The transport directive specifies our spamassassin transport. The condition directive prevents a spam-checking loop when messages are reinjected by ensuring that the value of $received_protocol is not sa-checked. The no_verify and no_expn directives instruct Exim to skip this router when performing address verification or expansion.
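
The loop-prevention logic can be modeled in a few lines of Python. This is only an illustrative sketch of the behavior described above, not Exim code. The function names and the dictionary used to represent a message are hypothetical.

```python
def needs_spam_check(received_protocol):
    """Mirror the router condition: check unless already marked sa-checked."""
    return received_protocol != "sa-checked"


def reinject(message):
    """Model `exim -bS -oMr sa-checked`: return a copy marked as checked."""
    checked = dict(message)
    checked["received_protocol"] = "sa-checked"
    return checked


def route(message, max_passes=5):
    """Return (number of spam checks performed, final message)."""
    checks = 0
    for _ in range(max_passes):
        if not needs_spam_check(message["received_protocol"]):
            break  # the router is skipped; later routers handle delivery
        checks += 1
        message = reinject(message)
    return checks, message


if __name__ == "__main__":
    print(route({"received_protocol": "esmtp"}))
```

A message that arrives over SMTP is checked once. The reinjected copy carries sa-checked and falls through to the remaining routers.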

Add the router definition from Example 8-3 to the section of /usr/exim/configure that lists routers. The order of the router definitions is significant. Where you add the spamassassin_router router in the list determines which messages will be checked, as shown in Table 8-1. Most sites will probably want to add the router between system_aliases and userforward (or possibly between userforward and a procmail router), but spam-checking gateways are likely to need the router before dnslookup as nearly all of their mail will be destined for remote sites.

Table 8-1. Effect of the position of spamassassin_router in the Exim router list

Position | Effect
First in the list (before dnslookup) | SpamAssassin invoked on all messages, including local deliveries, outgoing messages, and messages relayed to remote hosts.
Between dnslookup and system_aliases | SpamAssassin invoked on messages with addresses in locally hosted domains. System aliases and user .forward files will receive messages already spam-checked (and can act on tagging).
Between system_aliases and userforward | SpamAssassin invoked on messages with addresses in locally hosted domains, unless system alias file redirected them to a remote host. User .forward files will receive messages already spam-checked (and can act on tagging).
Between userforward and localuser | SpamAssassin invoked only on messages that will be delivered locally. User .forward files will receive messages without spam-checking. Spam-checked messages will be delivered to local mailbox.
After localuser | Too late! Messages will already have been delivered.

Using Per-User Spam-Checking Preferences

Because Exim routes each delivery address separately, you can configure it to behave differently for messages that will be delivered locally and messages that will be relayed to remote hosts. You can take advantage of this flexibility to direct SpamAssassin to use per-user preferences when checking a message that is destined for a local user and to use sitewide preferences when checking a message that is destined for a remote user. This approach requires a second transport and a second router. Add another transport such as that shown in Example 8-4 to your Exim configuration file.

Example 8-4. Transport for local spam-checking in Exim

spamassassin_local:
  driver = pipe
  use_bsmtp = true
  command = /usr/exim/bin/exim -bS -oMr sa-checked
  transport_filter = /usr/bin/spamc -f -u $local_part
  home_directory = "/tmp"
  current_directory = "/tmp"
  user = exim
  group = exim
  log_output = true
  return_fail_output = true
  return_path_add = false

The key addition in the spamassassin_local transport is the use of spamc's -u user command-line option to specify the user on whose behalf spamc is running. spamc will convey the username to spamd, which will examine the user's .spamassassin/user_prefs file for preferences.

Warning

For this transport to work, spamd must be able to read users' preference files. Because you should run spamd under a dedicated user and group, this user or group must be able to search the .spamassassin subdirectory of each user's home directory and read the user_prefs file. (You may instead run spamd as root, but using a dedicated user is a better security practice.)

You must not invoke spamd with the --nouser-config or --auth-ident options when using this transport. If you use --nouser-config, spamd will ignore spamc's -u argument, and user preferences will not be examined. If you use --auth-ident, spamd will attempt to confirm that spamc is being run by the user given in its -u argument. Because Exim runs as its own user, the authentication will fail and spamd will refuse to look up user preferences.

Next, add a router that uses the spamassassin_local transport, as shown in Example 8-5.

Example 8-5. A spam-checking router with user preferences in Exim

spamassassin_local_router:
  driver = accept
  transport = spamassassin_local
  condition = "${if !eq {$received_protocol}{sa-checked} {1}{0}}"
  no_verify
  no_expn

You should also modify spamassassin_router to limit its use to non-local domains. This modification is shown in Example 8-6.

Example 8-6. A spam-checking router for non-local domains in Exim

spamassassin_router:
  driver = accept
  transport = spamassassin
  domains = ! +local_domains
  condition = "${if !eq {$received_protocol}{sa-checked} {1}{0}}"
  no_verify
  no_expn

Arrange the routers in the following order:

  1. spamassassin_router, to perform spam-checking for messages addressed to remote domains
  2. dnslookup, to route messages addressed to remote domains via SMTP
  3. system_aliases, to redirect messages with addresses in /etc/aliases
  4. spamassassin_local_router, to perform spam-checking for messages addressed to local users with the per-user preferences of the local user (who may, however, choose to forward the tagged message elsewhere)
  5. userforward, to redirect messages with addresses in user .forward files
  6. localuser, to route messages via the local delivery agent

To illustrate how this approach functions, consider an Exim system running on mail.example.com and configured to relay messages for example.com to an internal mail server. On mail.example.com, postmaster is an alias for the local user chris. When a spammer sends a message addressed to sam@example.com and postmaster@mail.example.com, Exim passes each address through its list of routers. sam@example.com is routed by spamassassin_router, so a copy of the message is tagged by SpamAssassin using its sitewide configuration and then reinjected. The reinjected message bypasses spamassassin_router and is routed by dnslookup, which queues it for remote delivery. Meanwhile, postmaster@mail.example.com is destined for a local domain and bypasses both spamassassin_router and dnslookup. The system_aliases router rewrites the address to chris@mail.example.com, which Exim then begins routing. This address bypasses spamassassin_router, dnslookup, and system_aliases and is routed by spamassassin_local_router, which tags a copy of the message using chris's SpamAssassin preferences and reinjects it. The reinjected message bypasses spamassassin_router, dnslookup, system_aliases, and spamassassin_local_router, and assuming chris does not have a .forward file, Exim delivers it to chris's local mailbox. Figure 8-4 illustrates this process.

Figure 8-4. Exim router lookups during delivery
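
The routing walk-through for sam@example.com and postmaster@mail.example.com can also be traced with a small Python model. This is a simplified sketch, not a reimplementation of Exim's routing. It assumes that mail.example.com is the only local domain, that postmaster is the only alias, and that no user has a .forward file, so userforward always declines. The trace function and its tuple format are hypothetical.

```python
LOCAL_DOMAINS = {"mail.example.com"}
ALIASES = {"postmaster": "chris"}  # the alias from the example


def trace(address, checked=False):
    """Return the (address, router, action) steps taken for one address.

    The tests below are written in the same order as the router list:
    spamassassin_router, dnslookup, system_aliases,
    spamassassin_local_router, userforward (assumed to decline), localuser.
    """
    local_part, domain = address.split("@")
    is_local = domain in LOCAL_DOMAINS
    if not is_local and not checked:
        step = (address, "spamassassin_router", "check sitewide, reinject")
        return [step] + trace(address, True)
    if not is_local:
        return [(address, "dnslookup", "queue for remote delivery")]
    if local_part in ALIASES:
        new_address = ALIASES[local_part] + "@" + domain
        step = (address, "system_aliases", "redirect to " + new_address)
        return [step] + trace(new_address, checked)
    if not checked:
        step = (address, "spamassassin_local_router",
                "check as " + local_part + ", reinject")
        return [step] + trace(address, True)
    return [(address, "localuser", "deliver to local mailbox")]


if __name__ == "__main__":
    for rcpt in ("sam@example.com", "postmaster@mail.example.com"):
        for step in trace(rcpt):
            print(step)
```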

Using exiscan

One of Exim's most powerful and flexible features is its ACL system. Each ACL is a set of rules or tests that Exim performs when receiving a message; for example, an ACL is available for each stage of the SMTP transaction (start of connection, after HELO, after MAIL FROM, etc.). Rules are evaluated in order until one matches, and the associated action is then performed. Actions can include allowing the transaction to proceed, deferring the transaction, rejecting the transaction, ignoring the transaction, adding warning headers to the message, or dropping the connection altogether. If no rule matches, the ACL rejects the corresponding portion of the SMTP transaction.

exiscan is a set of patches for Exim that introduces the ability to invoke SpamAssassin in the acl_smtp_data ACL that Exim consults after the DATA step of an SMTP transaction. You can download exiscan from http://duncanthrax.net/exiscan-acl/; many precompiled versions of Exim (e.g., in Linux distributions) have the patch already applied. exiscan's new ACL actions also include blocking MIME attachments, virus-checking, and checking headers against regular expressions.

Installing exiscan

If you're not using a version of Exim that has exiscan already compiled in, you should download the exiscan patch file and apply it to your Exim source code with the GNU patch program. Example 8-7 shows the patch process, assuming that both the Exim source code and the patch are in /usr/local/src. Stop and restart Exim after you install the patched version.

Example 8-7. Patching the Exim source code with exiscan

$ cd /usr/local/src/exim-4.30
$ patch -p1 -s < ../exiscan-acl-4.30-14.patch
$ rm -rf build-*
$ make
...Compilation messages...
$ su
Password: XXXXXXXX
# make install

Tip

The rm -rf build-* command removes any old Exim build directories that may be present and forces Exim's Makefile to recreate them and repopulate them with symbolic links to source code files. This is important, because exiscan adds new source code files that would otherwise not have links in the build directory.

Writing acl_smtp_data

exiscan extends Exim's ACL language by adding a new rule, spam, that makes a connection to spamd to request a message check on behalf of a specified user and returns true if the message would exceed the user's SpamAssassin spam threshold. Example 8-8 shows a simple acl_smtp_data that uses the spam condition to add an X-Spam-Flag: YES header to spam messages.

Example 8-8. Adding an X-Spam-Flag header with exiscan

acl_smtp_data:

  warn  message = X-Spam-Flag: YES
        spam = nobody

In this ACL, the condition spam = nobody asks spamd to check the message on behalf of the user nobody. If the message's spam score exceeds nobody's threshold, Exim takes the warn action, adding the X-Spam-Flag header. Similarly, the following ACL rule will generate a second Subject header with a spam tag for spam messages.

warn  message = Subject: *SPAM* $h_Subject
      spam = nobody

Tip

ACLs can add headers but cannot remove them or modify them in situ. To replace the Subject header with a tagged version, you must add a new header through the ACL (e.g., X-Spam-Subject) and direct Exim's system filter to replace the message subject with the new header if it's present. An example of how to do this is included with the exiscan documentation.

The spam condition also sets several useful Exim variables as a side effect:

$spam_bar
If SpamAssassin gives a message a positive spam score, exiscan sets this variable to a string of plus (+) characters, with one plus for each point of spam score, up to 50. If SpamAssassin gives a message a negative spam score, exiscan sets this variable to a string of minus characters (-), with one minus for each negative point of spam score. If SpamAssassin gives a message a zero spam score, exiscan sets this variable to a slash (/) character.
$spam_report
The full SpamAssassin report on a message.
$spam_score
The score assigned to a message by SpamAssassin.
$spam_score_int
The score assigned to a message by SpamAssassin multiplied by 10. exiscan stores this variable in the message's spool file, so Exim can use this value in later processing (e.g., in routers) to handle high-scoring messages differently than low-scoring messages.
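
To make the variable definitions concrete, here is a small Python sketch that derives $spam_bar and $spam_score_int from a score. It follows only the description above and is not exiscan's code. Truncating the score to whole points and rounding the multiplied score are assumptions, and the function names are hypothetical.

```python
def spam_bar(score):
    """Build the bar described for $spam_bar."""
    points = int(score)  # assumption: fractional points are dropped
    if points > 0:
        return "+" * min(points, 50)  # one plus per point, at most 50
    if points < 0:
        return "-" * (-points)        # one minus per negative point
    return "/"                        # zero (or, here, less than one point)


def spam_score_int(score):
    """Score multiplied by 10, as an integer (rounding is an assumption)."""
    return int(round(score * 10))


if __name__ == "__main__":
    for s in (3.7, -2.1, 0, 80):
        print(s, spam_bar(s), spam_score_int(s))
```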

These variables can be used with warn or deny actions to implement several kinds of spam policies. Example 8-9, adapted from the exiscan documentation, shows how you can direct Exim to add an X-Spam-Score header for all messages, to add an X-Spam-Report header for spam, and to reject a message completely if the spam score is higher than 12.

Example 8-9. Spam policies with exiscan

warn  message = X-Spam-Report: $spam_report
      spam = nobody

warn  message = X-Spam-Score: $spam_score ($spam_bar)
      spam = nobody:true

deny   message = This message scored $spam_score spam points.
       spam = nobody
       condition = ${if >{$spam_score_int}{120}{1}{0}}

The first rule performs spam-checking and adds the X-Spam-Report header if a message exceeds the spam threshold. exiscan caches the spam-checking results, so future calls to the spam condition for this message will not actually recheck the message. The second rule uses the :true option, which causes the condition to be evaluated as true regardless of the results of the spam check. Accordingly, Exim will add an X-Spam-Score header to all messages. Finally, Exim executes the deny action (refusing the message with the given text added to the SMTP rejection response) if the $spam_score_int is greater than 120 (which corresponds to a SpamAssassin score greater than 12.0).
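
The combined effect of the three rules in Example 8-9 can be summarized in a short Python sketch. It is a simplified model and not how Exim evaluates ACLs internally. The function name, the is_spam argument (standing in for the result of the spam condition), and the placeholder report text are hypothetical.

```python
def apply_policy(score, is_spam, reject_above_int=120):
    """Return (action, lines) for one message under the Example 8-9 rules.

    is_spam stands in for the result of `spam = nobody`.
    """
    score_int = int(round(score * 10))  # models $spam_score_int
    # deny rule: the message is spam AND its score_int is greater than 120
    if is_spam and score_int > reject_above_int:
        return "deny", ["This message scored %s spam points." % score]
    headers = []
    if is_spam:
        # first warn rule: added only when the spam condition is true
        headers.append("X-Spam-Report: <report>")
    # second warn rule: `spam = nobody:true` always matches
    headers.append("X-Spam-Score: %s" % score)
    return "accept", headers


if __name__ == "__main__":
    print(apply_policy(2.0, False))
    print(apply_policy(7.5, True))
    print(apply_policy(12.1, True))
```

Note that a score of exactly 12.0 is still accepted, because the condition tests for a value greater than 120.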

Using Per-User Preferences

Because exiscan checks messages for spam just once—at message receipt after the SMTP DATA command—it's difficult to use SpamAssassin's per-user preference files. Messages may have multiple recipients, some of whom are not local, and exiscan will not be able to determine whose preferences should be used.

You can continue to use per-user preferences with exiscan in two ways, but each comes at a performance cost.

  • You can ensure that each email message will have only a single recipient by writing an ACL for the SMTP RCPT TO phase that defers all recipients except the first one. The sending MTA will retry delivery to the deferred recipients but may not do so immediately. As a result, some copies of messages with multiple recipients may be significantly delayed. The exiscan documentation includes an example of how to do this.
  • You can use exiscan to perform initial spam-checking and refuse messages with high scores, and then use the router/transport approach described earlier to reinvoke SpamAssassin on the remaining messages for local recipients. This approach results in an extra spamd connection for each message with a local recipient but might be worthwhile if exiscan can refuse enough very obvious spam sent to multiple recipients.
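
The cost of the first option can be seen in a small Python model of the RCPT TO phase. This is a sketch of the idea only. It assumes the sending MTA retries every deferred recipient in a later transaction, and the function names are hypothetical.

```python
def rcpt_acl(accepted_so_far):
    """Accept the first recipient of a transaction and defer the rest."""
    return "accept" if accepted_so_far == 0 else "defer"


def deliver(recipients):
    """Return the accepted recipients of each successive SMTP transaction."""
    transactions = []
    pending = list(recipients)
    while pending:
        accepted, deferred = [], []
        for rcpt in pending:
            if rcpt_acl(len(accepted)) == "accept":
                accepted.append(rcpt)
            else:
                deferred.append(rcpt)
        transactions.append(accepted)
        pending = deferred  # the sending MTA retries these later
    return transactions


if __name__ == "__main__":
    print(deliver(["a@example.com", "b@example.com", "c@example.com"]))
```

A message with three recipients therefore needs three transactions, and each retry waits on the sender's queue schedule.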

Using sa-exim

Exim calls its local_scan( ) function once just before accepting a message (via SMTP or from a local process). By default, this function does nothing—the implementation of the function in Exim's source code simply instructs Exim to accept the message. What makes local_scan( ) powerful is that you can replace Exim's version with your own code to perform custom message-checking. This function can be a good place to perform spam-checking.

Even better, you don't have to write a new local_scan( ) yourself if you want to invoke SpamAssassin. Marc Merlin has written one for you: sa-exim. sa-exim invokes spamc in its local_scan( ) function and can thus take advantage of all of spamd's configuration options. This section describes the installation and configuration of sa-exim. You can download it at http://sa-exim.sf.net. It requires Exim 4.11 or later.

Building sa-exim for Static Integration

Once you've unpacked the source code, you can choose one of two approaches to integrating sa-exim with Exim. This section focuses on static integration, which embeds sa-exim within Exim at compile time. The examples in this section assume you have unpacked Exim's source code in /usr/local/src/exim-4.30 and sa-exim's in /usr/local/src/sa-exim-3.1.

Tip

Whichever approach you choose for integrating sa-exim, be sure that LOCAL_SCAN_HAS_OPTIONS has not been set to yes in Exim's Local/Makefile (it is not set by default).

To use the static integration approach, you edit sa-exim's sa-exim.c file, then replace Exim's src/local_scan.c file with sa-exim's sa-exim.c file, copy sa-exim's sa-exim.h file to the same location, and recompile (and reinstall) Exim. The local_scan( ) function in sa-exim.c replaces the default function.

Two macro definitions in sa-exim.c must be edited. They appear in the code under the comment "Compile time config values" and provide the location of spamc (by default, /usr/bin/spamc) and sa-exim's own configuration file (by default, /etc/exim4/sa-exim.conf, but you might change this location to /usr/exim/sa-exim.conf or /etc/sa-exim.conf as suits your system).

$ cd /usr/local/src/sa-exim-3.1
...Edit sa-exim.c in your favorite editor...
$ make sa-exim.h
echo "char *version=\"`cat version` (built `date`)\";" > sa-exim.h
$ cp sa-exim.c ../exim-4.30/src/local_scan.c
$ cp sa-exim.h ../exim-4.30/src
$ cd ../exim-4.30
$ make
$ su
Password: XXXXXXXX
# make install

The static integration approach is easy but requires you to recompile Exim whenever you want to update sa-exim.

Building sa-exim for Dynamic Integration

Using the dynamic integration approach, you patch Exim to allow the local_scan( ) function to be dynamically loaded at runtime, and you compile sa-exim as a dynamically loadable executable. Many packaged versions of Exim are distributed with the dynamic loading patch already applied, but sa-exim includes two versions of the patches by David Woodhouse that you can apply to your Exim source code yourself. Use localscan_dlopen_up_to_4.14.patch to patch Exim versions 4.11 to 4.14; use localscan_dlopen_exim_4.20_or_better.patch to patch Exim 4.20 and later versions. Example 8-10 illustrates the patch process.

Example 8-10. Patching Exim to support dynamic loading

$ cd /usr/local/src/exim-4.30
$ patch -p1 < ../sa-exim-3.1/localscan_dlopen_exim_4.20_or_better.patch
patching file src/EDITME
Hunk #1 succeeded at 505 (offset 117 lines).
patching file src/config.h.defaults
Hunk #1 succeeded at 20 (offset 3 lines).
patching file src/globals.c
Hunk #1 succeeded at 108 (offset 5 lines).
patching file src/globals.h
Hunk #1 succeeded at 72 (offset 5 lines).
patching file src/local_scan.c
patching file src/readconf.c
Hunk #1 succeeded at 224 (offset 42 lines).
$ make
$ su
Password: XXXXXXXX
# make install

After installing the patched Exim, compile sa-exim as a dynamically loadable object file by editing its Makefile. Check that the definitions of CC, CFLAGS, and LDFLAGS are suitable for building a shared object file with your compiler. Set the following macros in the Makefile:

SACONF
The path where you will locate sa-exim's configuration file (e.g., /etc/exim4/sa-exim.conf, /usr/exim/sa-exim.conf, or whatever suits your system)
SPAMC
The location of spamc (e.g., /usr/bin/spamc)
EXIM_SRC
The path to the Exim source code's src directory (e.g., /usr/local/src/exim-4.30/src)

Run make to compile sa-exim; make should produce the shared object files sa-exim-3.1.so and accept.so. The former is the sa-exim replacement for the local_scan( ) function. The latter is a replacement for local_scan( ) that simply accepts all messages; you can use accept.so to test that dynamic loading works properly without the complexities of sa-exim.

Copy these shared object files to an appropriate Exim directory (e.g., /usr/exim or /usr/exim/libexec), and add the following lines to the beginning of Exim's configuration file:

   local_scan_path = /usr/exim/accept.so
   #local_scan_path = /usr/exim/sa-exim-3.1.so

Restart Exim, and confirm that messages are being received. After you finish configuring sa-exim, edit Exim's configuration file again, comment out the accept.so line, uncomment the sa-exim-3.1.so line, and restart Exim again to activate sa-exim.

Configuring SpamAssassin for sa-exim

sa-exim invokes SpamAssassin using spamc, so you must be running the spamd daemon to use sa-exim.

sa-exim behaves as you'd expect with most of the settings you'd be likely to have in your sitewide configuration file (typically /etc/mail/spamassassin/local.cf). One that requires particular care, however, is the report_safe setting.

If you set report_safe to 0, SpamAssassin only adds spam-tagging headers and does not modify the body of messages. This setting works with sa-exim without any additional configuration and provides the fastest message-checking performance.

If you prefer to have SpamAssassin modify the body of the message to add its report and convert the original message into an attachment, you can set report_safe to 1 (include original message as message/rfc822 attachment) or 2 (include original message as text/plain attachment). In this case, you have to set the SARewriteBody variable in sa-exim.conf (described in the next section). Because sa-exim must read the modified body back from SpamAssassin, message-checking will be slightly slower than with report_safe 0. In addition, if you perform message-archiving, the archives will contain the SpamAssassin-modified message.

Finally, ensure that spamd is not being invoked with the --create-prefs option, as it should run as an unprivileged user and be unable to create user preference files anyway. You may wish to include the --nouser-config option as well.

Configuring sa-exim

You configure sa-exim by editing its sa-exim.conf configuration file. During the build of sa-exim, you should have specified a location for this file. Begin configuration by copying the sa-exim.conf file included with the sa-exim source code to this location. Edit the file to configure sa-exim.

The sa-exim.conf file is copiously commented. As the first comment describes, sa-exim is picky about the formatting of options in this file. For example, the following are examples of valid options in sa-exim.conf:

SApermreject: 12.0
SARewriteBody: 0
# The option below is commented out, and thus not set
#SApermrejectsave: /var/spool/exim/SApermreject

But none of the options in this next set is valid:

# No spaces are allowed before the colon! One and only one is required after!
Sapermreject :12.0
# Only thresholds may be floating point numbers!
SARewriteBody: 0.0
# This sets the option, with an empty value! Not the way to unset it!
SApermrejectsave:

Later definitions of the same option override earlier ones.
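
The formatting rules can be expressed as a short Python check. This sketch implements only the rules stated above (no space before the colon, exactly one space before a value, # for comments, later definitions win). It is not sa-exim's own parser, and the function name is hypothetical.

```python
import re

# Name, colon, then either nothing (empty value) or one space and a value.
OPTION_RE = re.compile(r"^(\w+):(?: (\S.*))?$")


def parse_conf(text):
    """Return (options, rejected_lines) for sa-exim.conf-style text."""
    options = {}
    rejected = []
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # blank line or comment
        match = OPTION_RE.match(line)
        if match:
            # Later definitions override earlier ones.
            options[match.group(1)] = match.group(2) or ""
        else:
            rejected.append(line)
    return options, rejected


if __name__ == "__main__":
    sample = "SApermreject: 12.0\nSapermreject :12.0\nSApermreject: 9.0\n"
    print(parse_conf(sample))
```

The check is purely syntactic. It does not know which options expect integers, so a value such as SARewriteBody: 0.0 would pass here even though sa-exim rejects it.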

The configuration file determines how sa-exim handles spam: sa-exim can accept messages (returning a 2xx SMTP code), accept and discard messages, temporarily fail messages (returning a 4xx SMTP code), reject messages (returning a 5xx SMTP code), or perform teergrubing during the SMTP connection. For each sa-exim action, you can control at what spam threshold the action is triggered, whether a message that triggered the action should be saved to an archive directory, and the location of the archive directory. sa-exim usually names files in the archive directory by concatenating the time (in seconds since 00:00:00 UTC on January 1, 1970) and the value of the Message-ID header of a given message.

The following sections examine the options in the sa-exim.conf configuration file.

Choosing messages on which to run SpamAssassin

The SAEximRunCond option specifies an Exim conditional expression that will be evaluated to determine whether SpamAssassin should be invoked on a message. To disable SpamAssassin, comment the option out or set its value to 0. To enable SpamAssassin on all messages, set the option's value to 1. The configuration file presents an example of how you can set this variable to check all messages except those originating from the local host or those with an X-SA-Do-Not-Run: Yes header:

SAEximRunCond: ${if and {{def:sender_host_address} {!eq {$sender_host_address}{127.0.0.1}} {!eq {$h_X-SA-Do-Not-Run:}{Yes}} } {1}{0}}

Choosing messages on which to take antispam actions

The SAEximRejCond option specifies an Exim conditional expression that will be evaluated to determine whether sa-exim should take actions on messages that SpamAssassin considers spam. By disabling the option, you can have messages checked by SpamAssassin (and tagged, if appropriate) but unconditionally accepted. The configuration file provides an example in which actions are taken on all spam messages except those with an X-SA-Do-Not-Rej: Yes header:

# X-SA-Do-Not-Rej should be set as a warn header if mail is sent to postmaster
# and abuse (in the RCPT ACL), this way you're not bouncing spam abuse reports
# sent to you
SAEximRejCond: ${if !eq {$h_X-SA-Do-Not-Rej:}{Yes} {1}{0}}

The X-SA-Do-Not-Run and X-SA-Do-Not-Rej headers can be added by the acl_smtp_rcpt ACL in Exim's own configuration file, using directives such as these:

  warn     message       = X-SA-Do-Not-Run: Yes
           hosts         = +relay_from_hosts

  warn     message       = X-SA-Do-Not-Run: Yes
           authenticated = *

  warn     message       = X-SA-Do-Not-Rej: Yes
           local_parts   = postmaster:abuse

These ACL directives will add X-SA-Do-Not-Run headers to messages from authenticated senders or from hosts from which Exim should relay messages, and will add X-SA-Do-Not-Rej headers to messages to postmaster or abuse. The X-SA-Do-Not-Run header should be removed before messages are relayed to remote hosts; add a headers_remove directive in the definition of the remote_smtp transport:

remote_smtp:
  driver = smtp
  headers_remove = "X-SA-Do-Not-Run"

You may wish to use different header names or values to prevent spammers from guessing your header and adding it to their spam messages to bypass sa-exim.
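As a sketch of that idea, the fragments below replace the value Yes with an arbitrary token. The token s3cr3t-token is a placeholder invented for this illustration; choose your own value and keep it identical in the ACL and in sa-exim.conf.

```
# Exim configuration file, inside the acl_smtp_rcpt ACL
# (s3cr3t-token is a hypothetical value; pick your own)
  warn     message       = X-SA-Do-Not-Run: s3cr3t-token
           hosts         = +relay_from_hosts

# sa-exim.conf: skip SpamAssassin only when the header carries that token
SAEximRunCond: ${if and {{def:sender_host_address} {!eq {$sender_host_address}{127.0.0.1}} {!eq {$h_X-SA-Do-Not-Run:}{s3cr3t-token}} } {1}{0}}
```

A spammer who adds X-SA-Do-Not-Run: Yes to a message gains nothing under this arrangement, because the condition only matches the token you chose.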

Limiting how much of the message is fed to SpamAssassin

SAmaxbody determines how many bytes of a message body sa-exim will feed to SpamAssassin for checking; it defaults to 256,000. If SATruncBodyCond evaluates to a false value, messages larger than SAmaxbody are not scanned at all. If SATruncBodyCond evaluates to a true value, such messages are truncated, and the first SAmaxbody bytes are scanned. This is generally not a good idea because proper MIME message formatting requires a closing MIME boundary string at the end of a message, and if SpamAssassin receives a partial body missing this string, it may complain that the message is misformatted.

Allowing SpamAssassin to rewrite message bodies

If you set SpamAssassin's report_safe option to 1 or 2 (asking SpamAssassin to rewrite message bodies), you must set the SARewriteBody variable to 1.
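A minimal sketch of the pairing, assuming your sitewide SpamAssassin settings live in local.cf. The two files must agree: if report_safe is nonzero but SARewriteBody is 0, the rewritten body is not read back.

```
# local.cf (SpamAssassin): attach the original message as message/rfc822
report_safe 1

# sa-exim.conf: read the rewritten body back from SpamAssassin
SARewriteBody: 1
```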

Archiving messages when actions are taken

Archiving message bodies preserves copies of messages in case they are needed later, and archived messages can be used as a quarantine system.

The value of SAmaxarchivebody determines the amount of a message (in bytes) to save when archiving messages after taking action on them. It defaults to 20,971,520 (20MB), which is a reasonable value. Similarly, SAerrmaxarchivebody determines the number of bytes of a message to save when a message causes an error in sa-exim. It defaults to 1,073,741,824 (1GB).

If SAPrependArchiveWithFrom is set to 1, sa-exim will add fake From lines to the beginning of archived messages so that the archive file will be in standard mbox format. This is usually desirable because it's easy to use most mail readers to examine an mbox file.
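For illustration, the archiving options described above might appear in sa-exim.conf as follows. The two size limits are simply the default values quoted in the text, written out in bytes.

```
# sa-exim.conf: archive limits, shown at the defaults described above
SAmaxarchivebody: 20971520
SAerrmaxarchivebody: 1073741824
# Write a fake From line so that each archive file is in mbox format
SAPrependArchiveWithFrom: 1
```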

Passing SMTP senders and recipients to SpamAssassin

Because sa-exim is invoked at the end of the SMTP DATA step, it has access to the list of recipients provided in the SMTP RCPT commands from the sending MTA. If you set SAmaxrcptlistlength to a value higher than 0, sa-exim adds an X-SA-Exim-Rcpt-To header containing the list of recipients as long as the list doesn't exceed the smaller of SAmaxrcptlistlength bytes or 8 KB.

sa-exim also has access to the SMTP MAIL FROM command and adds the SMTP sender to the message in the X-SA-Exim-Mail-From header.

The recipient list can be useful to SpamAssassin, as messages with a large number of recipients might be more likely to indicate spam, and the true list of recipients may not appear in the message To and Cc headers. Similarly, knowing the SMTP sender might help identify a known spammer or a spammer using an invalid sender address. By setting the SAaddSAEheaderBeforeSA option to 1, you direct sa-exim to add these headers before invoking SpamAssassin on a message, which is the default. Set SAaddSAEheaderBeforeSA to 0 if you prefer SpamAssassin to see messages with no sa-exim headers added.

Warning

Adding the X-SA-Exim-Rcpt-To header will expose recipients who were blind carbon copied (Bcc) and foil other legitimate strategies to keep the list of message recipients private. You should remove this header in your message transports (using the headers_remove directive) before messages are delivered.

If you allow SpamAssassin to rewrite message bodies, however, the headers will be encapsulated in the body of spam messages and cannot be removed. This may be acceptable to you, as these messages are spam anyway, but the privacy risk in the case of a false positive should be considered.
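One way to strip the header, sketched here for the remote_smtp transport, is Exim's headers_remove transport option, the same option used earlier for X-SA-Do-Not-Run. It takes a colon-separated list of header names. Transports that deliver to local mailboxes would need the same line; their names depend on your configuration, so they are not shown.

```
remote_smtp:
  driver = smtp
  # Colon-separated list of header names to strip before delivery
  headers_remove = "X-SA-Do-Not-Run:X-SA-Exim-Rcpt-To"
```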

Setting a timeout on spamc

sa-exim must wait for spamc to check messages but should not wait forever. By setting SAtimeout to a value in seconds, you ensure that if spamc should fail to check a message in a reasonable time, the message will be accepted. If you set SAtimeout to 0 (or to more than 300 seconds), Exim itself will interrupt a spamc run after five minutes, but it will cause the SMTP connection to return a temporary failure for the message, instead of accepting it. I recommend that you set SAtimeout and use a value between 60 and 240 seconds.

If a message is accepted due to a spamc timeout, and you set SAtimeoutsave to the absolute path of a directory, the message will be saved in that directory so you can see the impact of your SAtimeout settings. The directory must be writable by the Exim user; if it does not exist, sa-exim will attempt to create it.

You can limit which of these messages are saved by defining SAtimeoutSavCond to an Exim conditional expression. When spamc times out checking a message and the conditional expression returns a true value, the message will be saved. The default SAtimeoutSavCond is 1, which saves all messages when spamc times out.
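As a sketch, the following settings accept messages after a two-minute spamc timeout and keep a copy of each one for review. The directory name is an assumption made for this example; any directory writable by the Exim user will do.

```
# sa-exim.conf: accept on timeout, but keep a copy for later review
SAtimeout: 120
# Hypothetical archive directory
SAtimeoutsave: /var/spool/exim/SAtimeoutsave
SAtimeoutSavCond: 1
```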

Handling messages that cause sa-exim errors

Because sa-exim is a robust framework, it considers the possibility that a message might cause an error in sa-exim itself and provides the ability to handle such messages. If a message causes an error, and you set SAerrorsave to the absolute path of a directory, the message will be saved in that directory. The directory must be writable by the Exim user; if it does not exist, sa-exim will attempt to create it.

You can limit which error-causing messages are saved by defining SAerrorSavCond to an Exim conditional expression. If an error occurs and the conditional expression returns a true value, the message will be saved. The default SAerrorSavCond is 1, which saves all messages that cause sa-exim errors.

By default, sa-exim will accept messages that cause errors, which prevents mail loss. An alternative is to have sa-exim instruct Exim to temporarily fail such messages, which will cause the sending MTA to queue them and retry delivery later. To temporarily fail messages that cause errors, set SAtemprejectonerror to 1. You can change the message that is returned to the sending MTA when a message is temporarily failed by setting the SAmsgerror variable.
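A sketch of the temporary-failure alternative follows. The SAmsgerror text is invented for this example, not a value taken from the distributed file.

```
# sa-exim.conf: save error-causing messages and ask the sender to retry
SAerrorsave: /var/spool/exim/SAerrorsave
SAerrorSavCond: 1
SAtemprejectonerror: 1
# The text after the colon is an invented example message
SAmsgerror: Temporary local problem while scanning this message, please try again later
```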

Teergrubing

If you want sa-exim to perform teergrubing of a connection when spam is detected, set the SAteergrube variable to the SpamAssassin spam score at or above which teergrubing should take place. If you don't define this variable, sa-exim will not teergrube. See the sidebar Teergrubing, earlier in this chapter, for an explanation of that technique.

Set the SAteergrubecond variable to an Exim conditional expression to determine whether teergrubing should be performed when the spam score exceeds the SAteergrube threshold; teergrubing will be performed only when the expression evaluates to a true value. Use this variable to prevent teergrubing from affecting you or your secondary mail exchangers. The default sa-exim.conf file includes the following example, which prevents teergrubing of connections from 127.0.0.1 and 127.0.0.2:

SAteergrubecond: ${if and { {!eq {$sender_host_address}{127.0.0.1}} {!eq {$sender_host_address}{127.0.0.2}} } {1}{0}}

You can configure the teergrube delay—the total amount of time, in seconds, that you want to try to tie up the sending MTA—by setting the SAteergrubetime variable. The default is 900 (15 minutes). Every ten seconds during the teergrubing period, sa-exim will transmit SMTP code 451 with the reason given in SAmsgteergrubewait (which defaults to "wait for more output"). At the end of the teergrubing period, sa-exim will temporarily fail the message with the reason given in SAmsgteergruberej (which defaults to "Please try again later"). sa-exim temporarily fails the messages in the hopes that the sending MTA will later attempt to resend the message and spend more time in the tar pit.

If a message qualifies a connection for teergrubing, and you set SAteergrubesave to the absolute path of a directory, the message will be saved in that directory. The directory must be writable by the Exim user; if it does not exist, sa-exim will attempt to create it.

You can limit which of these messages are saved by defining SAteergrubeSavCond to an Exim conditional expression. If the conditional expression returns a true value, the message will be saved. The default SAteergrubeSavCond is 1, which saves all messages that trigger teergrubing.

Because sa-exim temporarily fails teergrubed mail after the teergrubing period, the sending MTA is likely to resend the same message. If you are saving messages that trigger teergrubing, it could lead to repeatedly saving multiple copies of the same message. To prevent this, set SAteergrubeoverwrite to 1 (which is the default), and sa-exim will use only the message ID as the filename when saving teergrubed messages. Because resends should have the same message ID, this will result in a single copy of the message being kept, as older copies are overwritten by newer copies assigned the same filename.
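Pulling the teergrubing options together, a configuration might look like the sketch below. The threshold of 25.0, the address 192.0.2.25 (standing in for a secondary mail exchanger), and the archive directory are all assumptions made for this illustration.

```
# sa-exim.conf: teergrube connections whose messages score 25 or higher
SAteergrube: 25.0
# Never teergrube the local host or our (hypothetical) secondary MX
SAteergrubecond: ${if and { {!eq {$sender_host_address}{127.0.0.1}} {!eq {$sender_host_address}{192.0.2.25}} } {1}{0}}
# Try to hold the connection for 15 minutes (the default)
SAteergrubetime: 900
# Keep one copy per message ID, overwritten on each resend
SAteergrubesave: /var/spool/exim/SAteergrube
SAteergrubeSavCond: 1
SAteergrubeoverwrite: 1
```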

Accepting and discarding spam

If you want sa-exim to accept and discard spam, set the SAdevnull variable to the SpamAssassin spam score at or above which messages should be accepted and discarded. If you don't define this variable, sa-exim will not take those actions.

If a message is to be discarded, and you set SAdevnullsave to the absolute path of a directory, the message will be saved in that directory. The directory must be writable by the Exim user; if it does not exist, sa-exim will attempt to create it.

You can limit which of these messages are saved by defining SAdevnullSavCond to an Exim conditional expression. If the conditional expression returns a true value, the message will be saved. The default SAdevnullSavCond is 1, which saves all messages that are discarded.
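For example, the following sketch silently discards messages that score 20 or higher while keeping a copy of each. The threshold and the directory name are illustrative choices.

```
# sa-exim.conf: accept and discard high-scoring spam, but archive it
SAdevnull: 20.0
SAdevnullsave: /var/spool/exim/SAdevnull
SAdevnullSavCond: 1
```

Because the sender receives a 2xx response, keeping an archive is the only way to recover a legitimate message that was discarded by mistake.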

Rejecting spam

If you want sa-exim to reject spam during the SMTP connection, set the SApermreject variable to the SpamAssassin spam score at or above which messages should be rejected. If you don't define this variable, sa-exim will not take this action. You can customize the rejection explanation that is sent along with the SMTP rejection code by setting SAmsgpermreject.

If a message is to be rejected, and you set SApermrejectsave to the absolute path of a directory, the message will be saved in that directory. The directory must be writable by the Exim user; if it does not exist, sa-exim will attempt to create it.

You can limit which of these messages are saved by defining SApermrejectSavCond to an Exim conditional expression. If the conditional expression returns a true value, the message will be saved. The default SApermrejectSavCond is 1, which saves all messages that are rejected.

Temporarily failing spam

If you want sa-exim to temporarily fail spam during the SMTP connection, set the SAtempreject variable to the SpamAssassin spam score at or above which messages should be temporarily failed. If you don't define this variable, sa-exim will not take this action. You can customize the rejection explanation that is sent along with the SMTP rejection code by setting SAmsgtempreject.

If a message is to be temporarily failed, and you set SAtemprejectsave to the absolute path of a directory, the message will be saved in that directory. The directory must be writable by the Exim user; if it does not exist, sa-exim will attempt to create it.

You can limit which of these messages are saved by defining SAtemprejectSavCond to an Exim conditional expression. If the conditional expression returns a true value, the message will be saved. The default SAtemprejectSavCond is 1, which saves all messages that are temporarily failed.

When sa-exim temporarily fails a message, the sending MTA is likely to resend the same message. If you are saving messages that trigger temporary rejections, this could lead to repeatedly saving multiple copies of the same message. To prevent this, set SAtemprejectoverwrite to 1 (which is the default), and sa-exim will use only the message ID as the filename when saving temporarily failed messages. Because resends should have the same message ID, this will result in single copies of messages being kept, as older copies are overwritten by newer copies assigned the same filename.

There are few good reasons to temporarily fail spam. If you do not want to receive spam at all, permanently reject or accept and discard it instead. If you want to tie up spammer MTAs, teergrube instead. sa-exim includes temporary failing for completeness, but I do not recommend its use.

Archiving accepted spam

When sa-exim receives a message that SpamAssassin tags as spam but that does not meet any of the sa-exim action thresholds, sa-exim will accept the (tagged) message and allow it to be delivered to the recipient.

If a message is to be accepted, and you set SAspamacceptsave to the absolute path of a directory, the message will be saved in that directory. The directory must be writable by the Exim user; if it does not exist, sa-exim will attempt to create it.

You can limit which of these messages are archived by defining SAspamacceptSavCond to an Exim conditional expression. If the conditional expression returns a true value, a message will be archived. The default SAspamacceptSavCond is 0, which does not archive any accepted spam messages.
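Because the default condition is 0, archiving accepted spam takes both settings. A sketch, with an assumed directory name:

```
# sa-exim.conf: archive spam that was tagged but still delivered
SAspamacceptsave: /var/spool/exim/SAspamaccept
SAspamacceptSavCond: 1
```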

Although this feature is not useful for end users, mail administrators can use it to help decide whether to lower one of the other action thresholds by examining the saved messages. If there are no false positives, you might lower the action thresholds.

Archiving non-spam messages

When sa-exim receives a message that SpamAssassin does not consider spam, sa-exim will (of course) accept the message and allow it to be delivered to the recipient.

If a non-spam message is received, and you set SAnotspamsave to the absolute path of a directory, the message will be saved in that directory. The directory must be writable by the Exim user; if it does not exist, sa-exim will attempt to create it.

You can limit which of these messages are saved by defining SAnotspamSavCond to an Exim conditional expression. If the conditional expression returns a true value, the message will be saved. The default SAnotspamSavCond is 0, which does not save any accepted non-spam messages.

A mail administrator might use this feature to analyze a group of non-spam messages to determine whether SpamAssassin is making too many false negative judgments, but on a busy mail site, saving extra copies of all legitimate incoming mail is probably not a good idea. sa-exim includes this feature primarily for completeness.

Debugging sa-exim

Set the SAEximDebug variable to a number between 1 and 9 to enable extra logging; higher numbers produce more debugging output. The distributed sa-exim.conf file sets this variable to 1, which will log a notice whenever sa-exim saves a new message to one of its archive directories, invokes spamc, rewrites message bodies, or evaluates an Exim conditional expression. Increasing SAEximDebug is a good idea, particularly when testing new conditional expressions.

Example 8-11 shows a complete sa-exim.conf file (without the distributed comments). In this example, sa-exim is configured to reject (but save) messages with spam scores of 15 or higher.

Example 8-11. A complete sa-exim.conf file

# Run SpamAssassin unless the message was submitted locally or the
# X-SA-Do-Not-Run header is set to 'secret'. We configure Exim elsewhere
# to set this header for messages from authenticated senders or hosts
# we relay for
SAEximRunCond: ${if and {{def:sender_host_address} {!eq {$sender_host_address}{127.0.0.1}} {!eq {$h_X-SA-Do-Not-Run:}{secret}} } {1}{0}}

# Don't take action on messages if the X-SA-Do-Not-Rej header is set to
# 'Yes'. We configure Exim to set this header for messages to the postmaster.
SAEximRejCond: ${if !eq {$h_X-SA-Do-Not-Rej:}{Yes} {1}{0}}

# Feed up to 300 KB to SpamAssassin, and if the message is longer, don't
# bother spam checking
SAmaxbody: 307200
SATruncBodyCond: 0

# We don't let SpamAssassin rewrite message bodies, so we don't set this
SARewriteBody: 0

# I prefer to avoid the X-SA-Exim-Rcpt-To header, for privacy reasons.
SAmaxrcptlistlength: 0

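# Allow spamc 2 minutes for each message. If it times out, don't bother
# saving messages, just accept them. SAtimeoutsave stays commented out
# (unset); giving it an empty value would not unset it.
SAtimeout: 120
#SAtimeoutsave: /var/spool/exim/SAtimeoutsave
SAtimeoutSavCond: 0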

# Do save messages that cause an error in sa-exim, but accept them
SAerrorsave: /var/spool/exim/SAerrorsave
SAerrorSavCond: 1
SAtemprejectonerror: 0

# Reject messages with SpamAssassin scores of 15 or higher, but save a
# copy of them.
SApermreject: 15.0
SApermrejectSavCond: 1
SApermrejectsave: /var/spool/exim/SApermreject

Using Per-User Preferences

Like exiscan, sa-exim checks messages for spam just once—at message receipt after the SMTP DATA command. And like exiscan, it's difficult to use SpamAssassin's per-user preference files with sa-exim. Messages may have multiple recipients, some of whom are not local, and sa-exim will not be able to determine whose preferences should be used.

You can use per-user preferences with sa-exim in the same ways as you can with exiscan, and with the same performance costs:

  • You can ensure that each email message will have only a single recipient by writing an ACL for the SMTP RCPT TO phase that defers all recipients except the first one. The sending MTA will retry delivery to the deferred recipients but may not do so immediately. As a result, some copies of messages with multiple recipients may be significantly delayed.
  • You can use sa-exim to perform initial spam-checking and refuse messages with high scores, and then use the router/transport approach described earlier to reinvoke SpamAssassin on the remaining messages for local recipients. This approach results in an extra spamd connection for each message with a local recipient but might be worthwhile if sa-exim can refuse enough very obvious spam sent to multiple recipients.
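The first approach can be sketched as an ACL statement like the one below. It assumes that Exim's $recipients_count variable holds the number of recipients accepted so far for the current message, and that the statement is placed in the acl_smtp_rcpt ACL ahead of the statements that accept recipients. The message text is invented for this example.

```
# In the ACL named by acl_smtp_rcpt, before any accept statements
  defer    message       = Only one recipient per message, please retry the others
           condition     = ${if >{$recipients_count}{0} {yes}{no}}
```

The first RCPT command passes through to the rest of the ACL because no recipient has been accepted yet. Each later RCPT in the same message is deferred and left for the sending MTA to retry.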

Building a Spam-Checking Gateway

Any of the approaches discussed earlier can form the basis for a spam-checking Exim gateway. exiscan or sa-exim will likely yield better performance than a router/transport approach, and I recommend using them unless you need per-user preferences and are prepared to configure spamd to perform SQL-based lookups. The remainder of this chapter explains how to configure an Exim-based gateway for routing messages and how to add sitewide Bayesian filtering and autowhitelisting.

Routing Email Through the Gateway

Once you have Exim receiving messages for the local host and performing SpamAssassin checks on them using any of the methods outlined earlier, you can start accepting email for your domain and routing it to an internal mail server after spam-checking. Figure 8-5 illustrates this topology.

Figure 8-5. Spam-checking gateway topology


Exim domain lists

To configure Exim to relay incoming mail for example.com to internal.example.com, add the following lines to Exim's configuration file:

domainlist local_domains = @
domainlist relay_to_domains = example.com

The key feature of this configuration is that example.com is a domain to which Exim may relay but is not on the list of local domains (for which mail is to be delivered on this host). Remember that you must restart Exim after changing its configuration file.

Routing changes

Mail from the Internet for example.com should be sent to the spam-checking gateway mail.example.com. Add a DNS MX record for the example.com domain that points to mail.example.com.

Once received by mail.example.com, messages will be spam-checked and should then be relayed to internal.example.com by Exim. There are two ways to get Exim to relay these messages:

  • Set up an internal DNS MX record for example.com pointing to internal.example.com. When Exim on mail.example.com attempts to deliver messages for example.com, the dnslookup router will look up this MX record and deliver the messages to the internal mail host. This configuration may require that you run a "split DNS" system or use BIND 9's views feature to ensure that different MX records for example.com are published to the Internet and to internal hosts.
  • Set up a new Exim router using the manualroute driver to manually route incoming messages for example.com to the internal mail host. The router definition, shown in Example 8-12, should be placed in the list of routers before the dnslookup router. In this case, mail.example.com need only be able to resolve internal.example.com (or the IP address for internal.example.com could be substituted for its name in the router definition).

Example 8-12. Using a manualroute router to relay messages

internal_relay:
  driver = manualroute
  domains = example.com
  transport = remote_smtp
  route_list = example.com internal.example.com

Internal server configuration

Once the external mail gateway is in place, you can configure the internal mail server to accept only SMTP connections from the gateway (for incoming Internet mail). If you don't have a separate server for outgoing mail, the internal mail server should also accept SMTP connections from hosts on the internal network. These restrictions are usually enforced by limiting access to TCP port 25 using a host-based firewall or a packet-filtering router.

Adding Sitewide Bayesian Filtering

You can easily add sitewide Bayesian filtering to any of the Exim approaches because they are all based on spamd. Use the usual SpamAssassin use_bayes and bayes_path directives in local.cf, and ensure that spamd has permission to create the databases in the directory named in bayes_path. Use a directory for the databases that is owned by spamd's user, such as /var/spamd (or perhaps use /etc/mail/spamassassin). If local users need access to the databases (e.g., they will be running sa-learn), you may have to make the databases readable or writable by a group other than spamd's and adjust bayes_file_mode. Or you can make the databases world-readable or world-writable. Doing so, however, is unlikely to be necessary on a gateway system and puts the integrity of your spam-checking at the mercy of the good intentions and comprehension of your users.
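A minimal local.cf sketch follows, assuming the databases live under /var/spamd and that members of spamd's group should be able to read and write them. The filename prefix bayes and the mode 0770 are choices made for this example.

```
# local.cf: sitewide Bayesian databases
use_bayes 1
# bayes_path is a filename prefix; SpamAssassin appends suffixes such as _toks and _seen
bayes_path /var/spamd/bayes
# Read and write access for spamd's user and group only
bayes_file_mode 0770
```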

Adding Sitewide Autowhitelisting

Adding sitewide autowhitelisting is very similar to adding a sitewide Bayesian database. Just add the usual SpamAssassin auto_whitelist_path and auto_whitelist_file_mode directives to local.cf. As with the Bayesian databases, spamd's user must have permission to create the autowhitelist database and read and write to it. In SpamAssassin 2.6x, spamd must be started with the --auto-whitelist option; this option is not needed (and is deprecated) in SpamAssassin 3.0.
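A corresponding local.cf sketch, again assuming /var/spamd is owned by spamd's user. The filename auto-whitelist and the mode are illustrative.

```
# local.cf: sitewide autowhitelist database
auto_whitelist_path /var/spamd/auto-whitelist
# Accessible to spamd's user only
auto_whitelist_file_mode 0700
```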
