SpamAssassin

 

The following are the essential steps I took to install/configure SpamAssassin v2.6 to work with qmail + Vpopmail + Procmail and use an SQL database to store the individual SpamAssassin settings. I've created this page because I could find no help when I wanted to set up this particular combination.

First, I should mention that I wanted maximum control over how SpamAssassin works for individual mailboxes. I wanted to be able to turn it on or off per mailbox, configure its settings (e.g. Razor, required hits, Bayes DB location) per mailbox, and store all of this in an SQL database for ease of use.

1. Obviously you need SpamAssassin installed. :) I'm using version 2.60. NOTE: This guide may not work with newer versions of SpamAssassin.

I had to make a small change to one of the SpamAssassin modules, because normally it won't read certain settings from a database for security reasons. Note that I have modified the SpamAssassin source! Please use at your own risk! As long as access to your database is secure, you shouldn't have any problems.

     /usr/lib/perl5/site_perl/5.6.1/Mail/SpamAssassin/Conf.pm

Around line 280:

     sub parse_scores_only {
       my ($self) = @_;
       # Original line (second argument 1 = scores only):
       # $self->_parse ($_[1], 1); # don't copy $rules!
       $self->_parse ($_[1], $disallow_rules_in_sql); # allow copying of rules -dougl
     }

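Where the following is declared at file scope, above the sub (setting it back to 1 restores the original behavior):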

     my $disallow_rules_in_sql = 0;


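2. Here's how I run the "spamd" daemon (-d daemonizes, -a enables auto-whitelists, -v enables vpopmail support, -x disables per-user config files, -q reads user preferences from SQL, -u sets the user spamd runs as, and -H sets the home directory used by helper programs such as Razor):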

     spamd -d -a -v -x -q -u vpopmail -H /home/vpopmail/

2a. OPTIONAL: Install Razor. Run 'razor-admin -create' and 'razor-admin -register' as the vpopmail user. This creates a .razor directory in /home/vpopmail. Don't forget to patch Razor if you're using SpamAssassin 2.60 (the patch is included with SpamAssassin v2.60).

NOTE: I had an email from Eric with the following information:

One note though, in following your directions, I couldn't get the command "razor-admin -register" to work. I kept getting the error:

Error 202 while performing register, aborting.

I found http://www.mail-archive.com/razor-users@lists.sourceforge.net/msg00980.html, and it stated to run "razor-admin -discover" before "razor-admin -register" and all worked fine. You might want to update your page to reflect this little hiccup.
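Putting Eric's fix together with step 2a, the sequence would look something like the sketch below. The razor_setup function name and the RAZOR_ADMIN override are made up for illustration (neither is part of Razor); the three razor-admin options are the ones mentioned above. Run it as the vpopmail user.

```shell
#!/bin/sh
# Hypothetical helper: run the Razor setup steps in the order that worked
# for Eric (create, then discover, then register).
# RAZOR_ADMIN is an assumed override for the razor-admin path, not a Razor feature.
razor_setup() {
    for step in -create -discover -register; do
        # Stop at the first step that fails.
        ${RAZOR_ADMIN:-razor-admin} "$step" || return 1
    done
}

# Usage (as the vpopmail user):
#   razor_setup
```

Stopping at the first failure keeps a failed -discover from being followed by a -register that would probably fail as well.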

3. Make sure you have a recent version of Procmail installed.

4. Create a .procmailrc file and put it in the mailbox directory (i.e. /home/vpopmail/domains/<domain>/<user>/.procmailrc).

Contents of .procmailrc:

    # Note: paths in this file are relative to the email domain directory.
    # ie- /home/vpopmail/domains/<domain>/
    #
    # The following 4 lines do verbose procmail logging useful for debugging.
    # Comment them out before receiving a lot of email. ;)
    #
    LOGFILE=./pm.log
    LOG="
    "
    VERBOSE=yes
    MAILBOX="./<user>/Maildir/"
    SPAMBOX="./spam/Maildir/" # optional
    USER="<full email address of mailbox>"
    # SpamAssassin filter
    :0fw
    | /usr/bin/spamc -d 127.0.0.1 -u ${USER} -t 3
    :0e
    {
    EXITCODE=$?
    }
    # If you want procmail to put spam messages into another mailbox, 
    # uncomment these next lines and be sure that "SPAMBOX" is a valid 
    # mailbox
    #
    # Toss Spam into another maildir
    #:0
    #* ^X-Spam-Flag:.YES
    #${SPAMBOX}
    # Deliver the mail to the Maildir
    :0
    ${MAILBOX}

NOTE: The above .procmailrc has Procmail doing the mail delivery, NOT vdelivermail, which means you won't get email quota support. Using vdelivermail instead is a little trickier.

5. Edit the .qmail-<user> file for the mailbox on which you want to set up SpamAssassin:

     | preline /usr/bin/procmail -t ./<user>/.procmailrc


Known problem: SpamAssassin will not process email sent to a catch-all account, i.e. when you have a .qmail-default in the vpopmail domain directory that says something like this:

     | /home/vpopmail/bin/vdelivermail '' <some email address>

The next version of Vpopmail should fix this problem.

6. Create a MySQL database according to the SpamAssassin docs. Then you can insert records like this (sorry about the long lines):

mysql> select * from userpref;
+-----+----------------------+---------------------+--------------------------------------------------------------+----------------+
| oid | username             | preference          | value                                                        | updated        |
+-----+----------------------+---------------------+--------------------------------------------------------------+----------------+
|   1 | <full email address> | use_bayes           | 1                                                            | 20030924164532 |
|   6 | <full email address> | bayes_path          | /home/vpopmail/domains/<domain>/.spamassassin/bayes          | 20030924164532 |
|   4 | <full email address> | report_safe         | 0                                                            | 20030924164532 |
|   7 | <full email address> | auto_whitelist_path | /home/vpopmail/domains/<domain>/.spamassassin/auto-whitelist | 20030924164532 |
|   9 | <full email address> | auto_learn          | 1                                                            | 20030924164532 |
|  10 | <full email address> | rewrite_subject     | 1                                                            | 20030924164532 |
|  11 | <full email address> | required_hits       | 5.0                                                          | 20030924164532 |
|  12 | <full email address> | use_razor2          | 0                                                            | 20030924164532 |
+-----+----------------------+---------------------+--------------------------------------------------------------+----------------+
8 rows in set (0.02 sec)
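
The output above shows the rows as selected, not the statements that created them. Here is a minimal sketch of generating a matching INSERT from the shell. The sa_pref_sql function name is made up for illustration, and the sketch assumes that the oid and updated columns fill themselves in (auto-increment and timestamp), which the table suggests but does not confirm.

```shell
#!/bin/sh
# Hypothetical helper: print an INSERT statement for one userpref row.
# Assumes oid and updated are filled in by the database itself.
# Single quotes in the arguments are doubled so the SQL string literals
# stay valid; backslashes and other special characters are NOT handled.
sa_pref_sql() {
    user=$(printf '%s' "$1" | sed "s/'/''/g")
    pref=$(printf '%s' "$2" | sed "s/'/''/g")
    value=$(printf '%s' "$3" | sed "s/'/''/g")
    printf "INSERT INTO userpref (username, preference, value) VALUES ('%s', '%s', '%s');\n" \
        "$user" "$pref" "$value"
}

# Usage: print the statement, or pipe it into the mysql client, e.g.
#   sa_pref_sql '<full email address>' required_hits 5.0 | mysql -u <db user> -p <database>
```

Printing the statement instead of running it makes it easy to check what will hit the database before piping it to mysql.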

Each user (i.e. full email address) can have its own settings in this table. This makes it really easy to turn every SpamAssassin option on or off on an individual basis.

Note: Currently I'm using one Bayes database per email domain, but I could switch to one database per mailbox, or one database for the entire system, just by changing the bayes_path values in the SQL table.

With the settings above, SpamAssassin marks spam with the "X-Spam-Flag:" header. You can either configure the email client to filter on that header, or un-comment the lines in the .procmailrc file to direct spam messages to a different mailbox.

sa-learn Note: At this time, sa-learn doesn't work well with this configuration. As a work-around, I am limiting sa-learn to a web interface only. I have written a wrapper script that writes a temporary global SpamAssassin config file to trick sa-learn into using the correct Bayes database. The wrapper script then calls sa-learn like this:

        sa-learn -C /tmp/sa-learn_$random_number/ --$mode /tmp/sa-learn_$random_number/$mode.txt

Where $random_number is a randomly generated number and $mode is either "ham" or "spam". The $mode.txt file contains the email message to process. If you need more help with this, please let me know.
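
For what it's worth, the wrapper idea can be sketched in shell roughly as follows. This is not the actual wrapper script: the function name, the SA_LEARN override, the local.cf file name, and the use of mktemp in place of a random number are all assumptions made for illustration. It also assumes that sa-learn picks up *.cf files from the directory given to -C.

```shell
#!/bin/sh
# Hypothetical sketch of an sa-learn wrapper (not the author's script).
#   $1 = mode ("ham" or "spam")
#   $2 = Bayes path to learn into (same value as bayes_path in the SQL table)
#   $3 = file containing the email message to process
# SA_LEARN is an assumed override for the sa-learn path, handy for dry runs.
sa_learn_wrapper() {
    mode=$1
    bayes_path=$2
    message=$3
    case $mode in
        ham|spam) ;;
        *) echo "mode must be ham or spam" >&2; return 2 ;;
    esac
    # mktemp -d stands in for the randomly generated number.
    workdir=$(mktemp -d /tmp/sa-learn_XXXXXX) || return 1
    # Temporary global config that points sa-learn at the right Bayes DB.
    # The file name local.cf is an assumption.
    printf 'use_bayes 1\nbayes_path %s\n' "$bayes_path" > "$workdir/local.cf"
    cp "$message" "$workdir/$mode.txt" || { rm -rf "$workdir"; return 1; }
    ${SA_LEARN:-sa-learn} -C "$workdir/" "--$mode" "$workdir/$mode.txt"
    rc=$?
    # Clean up the temporary directory whether or not sa-learn succeeded.
    rm -rf "$workdir"
    return $rc
}

# Usage:
#   sa_learn_wrapper spam /home/vpopmail/domains/<domain>/.spamassassin/bayes message.txt
```

Using mktemp -d instead of a hand-rolled random number avoids collisions and predictable names in /tmp, which matters if a web interface calls the wrapper.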

I think that's most of it. I know this description was unorganized and sketchy, so please feel free to ask questions. Email me at dougl (at) dougledbetter.org