Berkeley CSUA MOTD:Entry 34402

2004/10/28 [Computer/SW/Languages] UID:34402 Activity:moderate
10/28   Gah! How do I keep spamassassin's files from putting me over
        quota? I have STFW. I spent a couple hours, actually. I lack
        sufficient clue to find the answer.
        \_ Add a 'rm -f LSPAM' line to your .login.
           \_ What does that do? I don't know of an "LSPAM" file.
              \_ Um, if you never see LSPAM files, never mind.
           \_ Does this mean someone else is using ifile?
              \_ Guilty as charged.
                 \_ Wow.
        \_ Link the following files in ~/.spamassassin to /dev/null
           (using ln -s):
                auto-whitelist.db@ -> /dev/null
                bayes_journal@ -> /dev/null
                bayes_seen@ -> /dev/null
                bayes_toks@ -> /dev/null
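           \_ A minimal sketch of the above, in case the listing is unclear.
              The trailing @ in the listing is just how ls -F marks a symlink,
              so the real names have no @. null_link_sa_files is a made-up
              helper name, and the directory argument is an assumption so you
              can try it on a scratch copy first. Note this throws away your
              Bayes data and auto-whitelist entirely.

```shell
# Hypothetical helper: replace SpamAssassin's per-user databases with
# symlinks to /dev/null. This disables Bayes learning and the auto-whitelist.
null_link_sa_files() {
    sa_dir="${1:-$HOME/.spamassassin}"
    for f in auto-whitelist.db bayes_journal bayes_seen bayes_toks; do
        # -f replaces any existing file or link of the same name
        ln -sf /dev/null "$sa_dir/$f" || return 1
    done
}

# Usage: null_link_sa_files            (defaults to ~/.spamassassin)
```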
           \_ Hmm. Maybe I'm asking too much, but is there a way to stay
              under quota that doesn't involve crippling bayesian filtering?
              I don't get why I'd be the only person whose bayes_* files are
              going over quota. I just haven't heard what people are doing
              about it given sa's popularity.
              \_ You aren't the only one. I asked for a quota increase.
                 It's not like it's a lot of space.
              \_ Google for Mail::SpamAssassin::Conf, and look at the
                 following configuration settings:
                 bayes_journal_max_size
                 bayes_expiry_max_db_size
                 bayes_auto_expire
                 bayes_learn_to_journal
                \_ Alrighty. The trouble is that the answers around the web are
                   tuned to the quotas on particular systems. I would like some
                   suggested values for these items (which all go into
                   .spamassassin/user_prefs, for others trying to learn from
                   this thread) that are good for soda.
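                   \_ Not soda-specific, but here is a sketch of what the
                      user_prefs lines could look like. The values are
                      placeholders I picked for illustration, not tested
                      against soda's quota. As far as I recall from the
                      Mail::SpamAssassin::Conf docs, the defaults are 150000
                      tokens for bayes_expiry_max_db_size and 102400 bytes
                      for bayes_journal_max_size, and expiry never keeps
                      fewer than 100000 tokens, so going below that buys
                      nothing. append_bayes_prefs is a made-up helper name.

```shell
# Hypothetical helper: append Bayes size limits to a user_prefs file.
append_bayes_prefs() {
    prefs="${1:-$HOME/.spamassassin/user_prefs}"
    cat >> "$prefs" <<'EOF'
# Placeholder values, not tuned for any particular quota.
bayes_auto_expire 1
bayes_expiry_max_db_size 100000
bayes_journal_max_size 51200
bayes_learn_to_journal 0
EOF
}

# Usage: append_bayes_prefs            (defaults to ~/.spamassassin/user_prefs)
```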
        \_ I have all these files like bayes_toks.expire7965 that I am pretty
           sure I did not used to have. What is going on?
           \_ http://csua.org/u/80t
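           \_ If you want to see how much space those leftovers are eating,
              something like this should do. It only lists them with their
              size in KB and deletes nothing. list_expire_leftovers is a
              made-up name, and the bayes_toks.expire* pattern is taken from
              the file name you quoted.

```shell
# Hypothetical helper: list leftover bayes_toks.expire* files with sizes in KB.
list_expire_leftovers() {
    sa_dir="${1:-$HOME/.spamassassin}"
    for f in "$sa_dir"/bayes_toks.expire*; do
        # If nothing matches, the pattern stays literal and this test skips it.
        if [ -f "$f" ]; then
            du -k "$f"
        fi
    done
}

# Usage: list_expire_leftovers         (defaults to ~/.spamassassin)
```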
Cache (1650 bytes)
csua.org/u/80t -> lists.roaringpenguin.com/pipermail/mimedefang/2004-January/019591.html
A few days ago I reported a problem I was having with my bayes database to the SATalk mailing list along with the observation that I was pretty sure it was a bug in the bayes expiry software. [...] expire4752

A few days later I got a message from another person, David Lee, who had run into the same problem and who thought it might be due to the controlling agent, in his case a program called Mailscanner, timing out the expiry process before it could complete. It turns out that is exactly what was happening (I think).

Bayes expiry can often take 3 or 4 minutes to complete, and if the system load happens to be really high when a mimedefang/spamassassin process decides it's time to do an expiry, the process can easily take much longer, and if it takes longer than 5 minutes you're in trouble, since AFAIK the sendmail default timeout on a milter operation is 5 minutes. If a bayes expiry takes longer than 5 minutes it will be abruptly terminated? I'm also pretty sure this must be the case because I copied the files to another location for testing and ran an expire via sa-learn and it finished successfully in about 8 minutes, so it wasn't a matter of a corrupted database causing the problem.

[...] cf file and use sa-learn to force an expire on a regular basis via cron. As I recall someone in this forum suggested such an approach in a previous posting, but never gave a reason, so it didn't occur to me that it was mandatory and not just a matter of personal preference. I'll also be reporting this to the SATalk mailing list along with the observation that bayes expiry takes much too long, and the code could use some work to improve performance.
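
The cron approach described in the cached post might look roughly like the sketch below. It assumes automatic expiry has been turned off separately with bayes_auto_expire 0 in the relevant SpamAssassin config, and it relies on sa-learn's --force-expire option. The helper name expire_cron_line and the default 04:15 schedule are illustrative choices, not something the post specifies.

```shell
# Hypothetical helper: print a crontab entry that forces a Bayes expiry
# once a day at the given minute and hour (defaults to 04:15).
expire_cron_line() {
    minute="${1:-15}"
    hour="${2:-4}"
    printf '%s %s * * * sa-learn --force-expire >/dev/null 2>&1\n' "$minute" "$hour"
}

# To install (appends to the current user's crontab):
#   (crontab -l 2>/dev/null; expire_cron_line) | crontab -
```

Running the expiry from cron keeps it outside the mail path, so a slow run is not cut off by the milter timeout the post describes.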