SpamAssassin/Using SpamAssassin as a Proxy
In some environments, it makes no sense to install SpamAssassin on the mail server. For example, the mail server may be too underpowered to perform content-checking. Or perhaps users have widely ranging preferences for how much (or indeed whether) spam-checking should be performed, and they may not have accounts on the mail server or any convenient way of configuring their preferences. In these environments, one way to provide spam-checking to the users who want the power of SpamAssassin is to help them install a SpamAssassin POP proxy.
Many more POP proxies are available than IMAP proxies, primarily because IMAP is a much more complex protocol and doesn't require that messages be downloaded to the client. At the time of writing, no freely distributed SpamAssassin IMAP proxies for Windows clients were available.
In addition, most extant proxies call SpamAssassin through the Perl API to avoid having to run the spamassassin shell script or a persistent spamd daemon. Because the Perl API will change in SpamAssassin 3.0, proxies written for SpamAssassin 2.63 are unlikely to continue to work until they are upgraded.
Proxy software is middleware. A proxy receives connections from a client and relays them to a server, intercepting all communication in each direction. Application proxies have been used to pierce smart holes in strong firewalls, to cache frequently accessed data, and to perform a variety of other functions.
Logically, POP proxies sit between a mail client and a POP server. In practice, these proxies typically run on the same computer as the mail client. The proxies discussed in this chapter not only relay data (email messages) between the client and server, but also invoke SpamAssassin to perform spam-checking on the email after it has been received from the server but before it is relayed to the client. Users continue to use their favorite POP client; no changes need be made at the POP server.
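The relay-and-check step can be pictured with a minimal Python sketch. It is not how Pop3proxy or SAproxy Pro is implemented: the function names, the `X-Example-Spam` header, and the keyword-based `toy_classifier` are all hypothetical stand-ins, and a real proxy would hand the message to SpamAssassin instead.

```python
from typing import Callable


def toy_classifier(raw: bytes) -> bool:
    """Hypothetical stand-in for SpamAssassin: flags one fixed phrase."""
    return b"cheap pills" in raw.lower()


def relay_message(raw: bytes, is_spam: Callable[[bytes], bool]) -> bytes:
    """Check a message fetched from the server; return what the client receives.

    The verdict is recorded in an illustrative header prepended to the
    message, so the original headers and body pass through unchanged.
    """
    verdict = b"YES" if is_spam(raw) else b"NO"
    return b"X-Example-Spam: " + verdict + b"\r\n" + raw


if __name__ == "__main__":
    message = b"Subject: hi\r\n\r\nBuy cheap pills now\r\n"
    print(relay_message(message, toy_classifier).decode("ascii"))
```

Because the check happens after the server has sent the message and before the client sees it, neither end needs to know the proxy exists; the client only notices the extra header.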
In this chapter, I review two SpamAssassin proxies. The first is the venerable Pop3proxy, a freely distributed command-line proxy script written in Perl and suitable for use on several operating systems. The second is the commercial proxy SAproxy Pro from Stata Labs.
POP proxies do not offer the complete functionality of POP servers; in particular, they may be limited in how they can perform authentication and secure the transaction. Using a POP proxy may result in sending your email password across the Internet in the clear.
Figure 9-1 illustrates the example topology for this chapter. pop.example.com is a POP mail server. win.example.com is a Windows-based user workstation that runs a POP mail client (e.g., Outlook Express, Eudora, Netscape Messenger). The SpamAssassin POP proxy will be installed on win.example.com, and the mail client will be configured to connect to the proxy rather than to the POP server. The proxy will be configured to connect to the POP server and to run SpamAssassin on messages as they are downloaded.
Pop3proxy, by Dan McDonald, is one of the oldest SpamAssassin POP proxies and, to its credit, still functions well with SpamAssassin 2.63. It's a no-frills proxy written in Perl and requires manual installation and configuration. It does not perform network-based SpamAssassin tests. Download Pop3proxy (and read the manual) at http://mcd.perlmonk.org.
Follow these steps to install Pop3proxy:
- Download pop3proxy.zip and unpack it into a directory of your choice. For this example, I assume you've unpacked it in C:\pop3proxy so a directory listing of that directory would look like this:
    C:\pop3proxy>dir /s

     Directory of C:\pop3proxy

    03/26/2004  09:56p    <DIR>          .
    03/26/2004  09:56p    <DIR>          ..
    03/26/2004  09:56p    <DIR>          pop3proxy
    08/18/2002  05:40p            60,781 pop3proxy.pl
    08/18/2002  05:40p            28,798 pop3proxy.html
                   2 File(s)         89,579 bytes

     Directory of C:\pop3proxy\pop3proxy

    03/26/2004  09:56p    <DIR>          .
    03/26/2004  09:56p    <DIR>          ..
    08/12/2002  08:19a             6,240 Artistic
    08/11/2002  08:45p               567 kill_proxy.pl
    08/12/2002  08:30a               536 hostmap.sam
                   3 File(s)          7,343 bytes
- Install a version of Perl for Windows that includes the Time::HiRes module. Several Perl distributions for Windows are available, but one that is known to work (and provides a precompiled version of the module) is ActivePerl, available at http://www.activestate.com/Products/ActivePerl. Either ActivePerl 5.6.1 or 5.8.3 works well with Pop3proxy. ActivePerl 5.8.3 supports Unicode. ActivePerl 5.6.1 does not support Unicode but has been extensively tested with SpamAssassin. In this example, I assume you've installed ActivePerl in C:\perl.
Time::HiRes can be installed through ActivePerl's Perl Package Manager. After installing ActivePerl, run the Perl Package Manager, and type install Time::HiRes at the ppm> prompt. Type quit to exit the Package Manager.
- Download and unpack SpamAssassin. Copy all of the files and directories in SpamAssassin's lib directory to ActivePerl's C:\perl\site\lib directory. Copy SpamAssassin's rules directory and all its contents to C:\pop3proxy\rules. Copy the user_prefs.template file from the rules directory to C:\pop3proxy and rename it user_prefs. The C:\pop3proxy directory should now look like this:
    C:\pop3proxy>dir /w /s

     Directory of C:\pop3proxy

    [.]              [..]             [pop3proxy]      pop3proxy.html
    pop3proxy.log    pop3proxy.pl     [rules]          user_prefs
                   4 File(s)        110,825 bytes

     Directory of c:\pop3proxy\pop3proxy

    [.]              [..]             Artistic         hostmap.sam
    kill_proxy.pl
                   3 File(s)          7,343 bytes

     Directory of C:\pop3proxy\rules

    [.]                      [..]                     10_misc.cf
    20_anti_ratware.cf       20_body_tests.cf         20_compensate.cf
    20_dnsbl_tests.cf        20_fake_helo_tests.cf    20_head_tests.cf
    20_html_tests.cf         20_meta_tests.cf         20_phrases.cf
    20_porn.cf               20_ratware.cf            20_uri_tests.cf
    23_bayes.cf              25_body_tests_es.cf      25_body_tests_pl.cf
    25_head_tests_es.cf      25_head_tests_pl.cf      30_text_de.cf
    30_text_es.cf            30_text_fr.cf            30_text_it.cf
    30_text_pl.cf            30_text_sk.cf            50_scores.cf
    60_whitelist.cf          languages                local.cf
    regression_tests.cf      STATISTICS-set1.txt      STATISTICS-set2.txt
    STATISTICS-set3.txt      STATISTICS.txt           triplets.txt
    user_prefs.template
To start Pop3proxy, you must invoke Perl on the pop3proxy.pl script and provide command-line arguments to identify the POP server. If you allowed ActivePerl to associate its perl.exe program with .pl file extensions, you should be able to execute pop3proxy.pl directly. Otherwise, set up a shortcut or batch file containing:
c:\perl\bin\perl c:\pop3proxy\pop3proxy.pl --host pop.example.com
When invoked, the shortcut will open a DOS window and execute the proxy script. You can stop the proxy by typing CTRL-C in the DOS window. When you've confirmed that it's working as you like, you can replace \perl\bin\perl with \perl\bin\wperl in the shortcut. wperl runs the script in the background (without opening a DOS window); use it when you plan to keep Pop3proxy running all the time. You can stop the proxy by invoking the kill_proxy.pl script included with Pop3proxy, or by using the Windows Task Manager to kill the wperl process.
Here is a complete list of Pop3proxy's command-line arguments:
- --host hostname[:port]
- Provide the hostname (and optionally, the port number) of the remote POP server to proxy.
- --logfile filename
- Provide the name of a file to log connection and status information to. This defaults to pop3proxy.log. The log file can be useful in debugging problems with Pop3proxy. Example 9-1 shows a Pop3proxy log of a successful connection in which Pop3proxy downloaded two messages and classified one as spam.
- --maxscan bytes
- Specify the largest message, in bytes, that Pop3proxy will invoke SpamAssassin on. The default is 250,000, which is reasonable. Larger sizes cause more messages to be scanned, but larger messages scan more slowly.
- --nopad
- POP servers sometimes provide POP clients with message- and mailbox-size information. Running SpamAssassin on a message when it's between server and client can change (typically, enlarge) the message size. Most modern clients handle this situation with no problems, but if yours does not, the --nopad option causes Pop3proxy to overwrite text in existing headers rather than adding new ones, maintaining a constant size at the cost of obfuscating the message headers to a small degree.
- --allowtop
- The POP protocol provides a TOP command that the client uses to request only a limited amount of a message from the server (deferring the retrieval of the rest of the message until it's explicitly asked for). TOP doesn't interact well with spam-checking proxies, and Pop3proxy normally prevents the client from using it. If you want to try it out anyway, use the --allowtop argument.
- --exitport portnumber
- The kill_proxy.pl script works by connecting to a second port that pop3proxy.pl listens on. Any connections on this port cause pop3proxy.pl to exit. By default, the port number is 9625, but you can use the --exitport option to change it if you use that port number for something else. If you change the port number, you must edit the kill_proxy.pl script and change the value of $exitport near the beginning of the file.
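As a rough illustration of how these arguments fit together, the following Python sketch assembles a Pop3proxy command line from the options listed above and mimics what the text says kill_proxy.pl does (opening a connection to the exit port). The function names are hypothetical, the default paths are the ones assumed earlier in this example, and nothing here is taken from the Pop3proxy source.

```python
import socket
from typing import List, Optional


def pop3proxy_command(host: str, port: Optional[int] = None,
                      logfile: Optional[str] = None,
                      maxscan: Optional[int] = None,
                      nopad: bool = False, allowtop: bool = False,
                      exitport: Optional[int] = None,
                      perl: str = r"c:\perl\bin\perl",
                      script: str = r"c:\pop3proxy\pop3proxy.pl") -> List[str]:
    """Build an argument list using only the options described in the text."""
    target = host if port is None else f"{host}:{port}"
    cmd = [perl, script, "--host", target]
    if logfile is not None:
        cmd += ["--logfile", logfile]
    if maxscan is not None:
        cmd += ["--maxscan", str(maxscan)]
    if nopad:
        cmd.append("--nopad")
    if allowtop:
        cmd.append("--allowtop")
    if exitport is not None:
        cmd += ["--exitport", str(exitport)]
    return cmd


def request_exit(exitport: int = 9625, host: str = "127.0.0.1",
                 timeout: float = 5.0) -> bool:
    """Open and close a connection to the exit port.

    Per the text, any connection on this port makes pop3proxy.pl exit.
    Returns False if nothing is listening there.
    """
    try:
        with socket.create_connection((host, exitport), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print(" ".join(pop3proxy_command("pop.example.com", maxscan=250000)))
```

If you start the proxy with a nondefault --exitport, pass the same number to `request_exit`; this mirrors the need to edit $exitport in kill_proxy.pl.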
Example 9-1. A log from Pop3proxy
    New connection: From: 127.0.0.1:2094 To: 192.168.0.4:110
    192.168.0.4:110 (Server) said +OK to none +OK POP3 Ready pop.example.com
    127.0.0.1:2094 (Client) said CAPA
    192.168.0.4:110 (Server) said -ERR to CAPA
    127.0.0.1:2094 (Client) said USER
    192.168.0.4:110 (Server) said +OK to USER
    127.0.0.1:2094 (Client) said PASS
    192.168.0.4:110 (Server) said +OK to PASS
    127.0.0.1:2094 (Client) said STAT
    192.168.0.4:110 (Server) said +OK to STAT
    127.0.0.1:2094 (Client) said UIDL
    192.168.0.4:110 (Server) said +OK to UIDL
    127.0.0.1:2094 (Client) said LIST
    192.168.0.4:110 (Server) said +OK to LIST
    127.0.0.1:2094 (Client) said RETR
    192.168.0.4:110 (Server) said +OK to RETR
    Snarfing RETR response
    Detected end of snarfed multiline
    35510 bytes, SPAM, Message-id: <200402062012.i16KCGBb015885@example.com>
    127.0.0.1:2094 (Client) said LIST
    192.168.0.4:110 (Server) said +OK to LIST
    127.0.0.1:2094 (Client) said RETR
    192.168.0.4:110 (Server) said +OK to RETR
    Snarfing RETR response
    Detected end of snarfed multiline
    2096 bytes, NOT spam, Message-id: <email@example.com>
    127.0.0.1:2094 (Client) said QUIT
    192.168.0.4:110 (Server) said +OK to QUIT
    192.168.0.4:110 - socket close on read
    Flushing peer on close
    127.0.0.1:2094 - peer gone after write, closing
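If you want a quick tally of what the proxy has classified, the verdict entries in Example 9-1 are regular enough to pick out with a short script. The Python sketch below assumes every verdict has the same shape as the two in the example (a byte count, then SPAM or NOT spam, then a Message-id); other versions of Pop3proxy may log differently, and the names `Verdict` and `verdicts` are illustrative.

```python
import re
from typing import List, NamedTuple


class Verdict(NamedTuple):
    size: int          # message size in bytes, as logged
    spam: bool         # True for "SPAM", False for "NOT spam"
    message_id: str    # Message-id including the angle brackets


# Pattern inferred from the two verdict entries shown in Example 9-1.
VERDICT_RE = re.compile(r"(\d+) bytes, (SPAM|NOT spam), Message-id: (<[^>]*>)")


def verdicts(log_text: str) -> List[Verdict]:
    """Return every classification verdict found in the log text."""
    return [Verdict(int(m.group(1)), m.group(2) == "SPAM", m.group(3))
            for m in VERDICT_RE.finditer(log_text)]


if __name__ == "__main__":
    with open("pop3proxy.log", encoding="utf-8", errors="replace") as log:
        found = verdicts(log.read())
    spam_count = sum(1 for v in found if v.spam)
    print(f"{len(found)} messages scanned, {spam_count} classified as spam")
```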
Pop3proxy can proxy multiple POP servers through the use of a hostmap file. See the Pop3proxy manual for more information about setting up such a file.
Configuring the POP Client
Finally, you must reconfigure a mail client to connect to localhost (or 127.0.0.1) instead of the usual POP server. Connections to localhost will be received by Pop3proxy and proxied to the POP server. Figure 9-2 shows the Eudora 5.1 dialog box for configuring the incoming POP server for an account.
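Before changing the mail client, it can be useful to confirm that something is answering on localhost. The Python sketch below uses the standard poplib module to fetch the greeting banner; it assumes the proxy listens on the standard POP3 port 110, which the text above does not state, so adjust the port if your setup differs. The function name `proxy_greeting` is illustrative.

```python
import poplib


def proxy_greeting(host: str = "127.0.0.1", port: int = 110,
                   timeout: float = 10.0) -> str:
    """Connect to the proxy, read the POP3 greeting, and disconnect.

    No USER or PASS command is sent, so no password crosses the wire.
    Port 110 is an assumption; it is simply the standard POP3 port.
    """
    conn = poplib.POP3(host, port, timeout=timeout)
    try:
        return conn.getwelcome().decode("ascii", "replace")
    finally:
        conn.quit()


if __name__ == "__main__":
    print(proxy_greeting())
```

A greeting beginning with +OK suggests the proxy accepted the connection and reached the POP server; a refused connection suggests the proxy is not running.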
Using SAproxy Pro
SAproxy Pro, by Stata Labs, began as a commercialized version of Pop3proxy, but it has been extensively developed with a focus on ease of use and access to both SpamAssassin and POP client features. It's available for Windows operating systems. At the time of this writing, the latest version is 2.5 and sells for $29.95, which includes a year of free upgrades. A free 15-day trial is available. You can download the trial version or purchase the product at http://www.statalabs.com/products/saproxy/.
SAproxy Pro 2.5 includes its own Perl library and SpamAssassin 2.63, so you don't have to install either of those products separately. When SpamAssassin 3.0 is released, a future version of SAproxy Pro is likely to distribute SpamAssassin 3.0 instead, and upgrading should be relatively simple.
Installing SAproxy Pro
SAproxy Pro uses a downloadable InstallShield installer. Once you've downloaded the installer, run it. You'll be prompted to select your mail client; for several of the most widely used mail clients, the installer includes a training video that demonstrates how to configure mail accounts to use the proxy. You may be required to reboot your computer to finish the installation.
Starting SAproxy Pro
After SAproxy Pro is installed, you can start it manually from the Windows Start menu. The installer also offers to configure SAproxy Pro to start automatically on system startup. When SAproxy Pro is running, a system tray icon will appear; right-clicking this icon brings up a menu for configuring or shutting down SAproxy Pro.
Configuring the POP Client
Configuring a POP client to use SAproxy Pro is straightforward. Set the incoming POP server to localhost or 127.0.0.1. Set the POP login name to your usual POP account name, followed by a colon and the hostname of the remote POP server. Figure 9-3 shows an example of this configuration in Microsoft Outlook Express 6. In the example, Outlook Express will connect to 127.0.0.1 and log in as alansz:pop.example.com. SAproxy Pro will accept the connection and will proxy for the POP server pop.example.com, using alansz as the login name.
Stata Labs also distributes an SSL plug-in module for SAproxy Pro at http://www.statalabs.com/products/saproxy/ssl/. If your POP server supports SSL connections, install the plug-in and add :ssl to the end of the POP login name (e.g., alansz:pop.example.com:ssl ) to direct SAproxy Pro to make an SSL connection to the POP server.
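The login-name convention described above is simple enough to express as a small helper. This Python sketch only builds the string that you would type into the mail client's account settings; the function name `saproxy_login` is hypothetical and is not part of SAproxy Pro.

```python
def saproxy_login(account: str, pop_host: str, ssl: bool = False) -> str:
    """Build the POP login name in the form SAproxy Pro expects.

    The format is account:host, with :ssl appended when the SSL plug-in
    should be used for the connection to the POP server.
    """
    name = f"{account}:{pop_host}"
    return f"{name}:ssl" if ssl else name


if __name__ == "__main__":
    print(saproxy_login("alansz", "pop.example.com"))
    print(saproxy_login("alansz", "pop.example.com", ssl=True))
```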
Configuring SAproxy Pro
SAproxy Pro really shines in its configuration interface, which is available by double-clicking the SAproxy Pro system tray icon, or by right-clicking the icon and selecting Configure SAproxy Pro. Most of the configuration options are the same as those available in Pop3proxy and through SpamAssassin's preference files, but the graphical interface makes them much easier for inexperienced users to select. The Configuration dialog box is divided into nine tabs:
- Always Spam
- Use this section to add blacklist entries. You can blacklist messages by sender email address, sender domain, or keyword.
- Never Spam
- Use this section to add whitelist entries. You can whitelist messages by sender email address or sender domain.
- Spam Training
- Use this section to enable use of SpamAssassin's Bayesian classifier (including auto-learning). SAproxy Pro can also manually scan email folders that you specify as containing spam or non-spam messages in order to train the classifier.
- Language Settings
- Use this section to limit the set of languages that you expect to receive email in; email in other languages will be treated as spam.
- Safety Settings
- Use this section to configure the report_safe SpamAssassin directive.
- Tagging Options
- Use this section to turn on subject-tagging for spam messages and to set the SpamAssassin threshold score for spam. SAproxy Pro allows thresholds between 3.5 and 6.5.
- Advanced Settings
- Use this section to turn on such options as logging, proxying of the TOP and AUTH commands, and the use of SpamAssassin's network tests. DCC, Pyzor, Vipul's Razor, and DNSBL tests are available, and you can turn each on or off independently. You can use the Host Map part of this section to configure SAproxy Pro to listen on different local ports to proxy connections to different remote POP servers.
- This section displays a line graph comparing the number of spam and non-spam messages received each day of the last month.
- This section provides links to SAproxy Pro help.