| 5/30 |
| 2004/7/2-4 [Computer/SW/SpamAssassin] UID:31126 Activity:high |
7/2 I notice that in my .spamassassin directory, there is a 5
Meg file called bayes_toks. I _do_ understand why it is there.
But, there is another 5 Meg file called bayes_toks.expire84232.
And there are 4 other bayes_toks.expire files of varying
size. Why are they there? Can I delete them? My spamassassin dir
takes up nearly 13 Megs. I gather that the expire files can be
deleted. I believe that I hit my hard quota while spamc was
running, leaving these orphaned files. Assuming this to be the
case, how do I stop spamc from auto-learning?
\_ So, I noticed that in my messages that were classified as spam,
it said "autolearn=no", but in those classified as ham, it said
"autolearn=ham", so I thought that the expire files are created
while it's trying to auto-learn. But in fact, it seems that it's
when I receive spam (at least in one case) that it creates the
expire files. What is it doing when it is creating
bayes_toks.expire files and how do I get it to stop? I just
want it to filter my mail. Why should it create any files? I
can sa-learn it on my own time. -op
\_ autolearn=no means SpamAssassin was not confident enough to
auto-learn the message as either spam or ham. You need to train
SpamAssassin manually with that message (sa-learn).
According to the global settings in
/usr/local/share/spamassassin/10_misc.cf:
Any mail with score > 12 is learnt as spam.
Any mail with score < 0.1 is learnt as ham.
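To make the threshold rule above concrete, here is a minimal Python sketch of the decision as the reply describes it. The function name and constants are hypothetical, and the boundary handling (strictly greater / strictly less) is an assumption taken from the wording of the post. Real SpamAssassin applies further conditions before auto-learning, so treat this as an illustration only.

```python
# Thresholds as quoted in the thread (from 10_misc.cf); assumed, not verified here.
SPAM_THRESHOLD = 12.0
HAM_THRESHOLD = 0.1


def autolearn_decision(score, spam_threshold=SPAM_THRESHOLD, ham_threshold=HAM_THRESHOLD):
    """Return 'spam', 'ham', or 'no' for a message score.

    Hypothetical helper: mirrors only the two-threshold rule described
    in the thread, not SpamAssassin's full auto-learn logic.
    """
    if score > spam_threshold:
        return "spam"
    if score < ham_threshold:
        return "ham"
    # Anything in between is left for manual training with sa-learn.
    return "no"


if __name__ == "__main__":
    for s in (-1.2, 0.1, 6.3, 12.0, 15.8):
        print(s, "autolearn=" + autolearn_decision(s))
```

This matches what op saw: a message scored as spam but below 12 ends up with autolearn=no. To turn auto-learning off altogether, SpamAssassin documents a `bayes_auto_learn 0` setting that can go in user_prefs.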
\_ http://csua.org/u/80t
\_ this is a good example of why url shortening can be bad.
\_ I don't think it's terribly bad, but if it makes you feel
better, I've changed the result page for shortcutting so it
shows Title: link for easy copy-and-pasting of the whole
thing. It won't necessarily end up shorter than the original
that way, but it will possibly be more informative. --dbushong
| csua.org/u/80t -> lists.roaringpenguin.com/pipermail/mimedefang/2004-January/019591.html

A few days ago I reported a problem I was having with my bayes database to the SATalk mailing list, along with the observation that I was pretty sure it was a bug in the bayes expiry software. [...] expire4752 [...]

A few days later I got a message from another person, David Lee, who had run into the same problem and who thought it might be due to the controlling agent, in his case a program called MailScanner, timing out the expiry process before it could complete. It turns out that is exactly what was happening (I think).

Bayes expiry can often take 3 or 4 minutes to complete, and if the system load happens to be really high when a mimedefang/spamassassin process decides it's time to do an expiry, the process can easily take much longer. If it takes longer than 5 minutes you're in trouble, since AFAIK the sendmail default timeout on a milter operation is 5 minutes. If a bayes expiry takes longer than 5 minutes it will be abruptly terminated. I'm also pretty sure this must be the case because I copied the files to another location for testing and ran an expire via sa-learn, and it finished successfully in about 8 minutes, so it wasn't a matter of a corrupted database causing the problem.

[...] cf file and use sa-learn to force an expire on a regular basis via cron. As I recall, someone in this forum suggested such an approach in a previous posting but never gave a reason, so it didn't occur to me that it was mandatory and not just a matter of personal preference.

I'll also be reporting this to the SATalk mailing list, along with the observation that bayes expiry takes much too long and the code could use some work to improve performance. |
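For the leftover bayes_toks.expire files op asked about, a small Python sketch like the following could list candidates for cleanup. The function name, the one-hour age cutoff, and the ~/.spamassassin location are assumptions for illustration. The cutoff is there so that a file an expiry run is still writing is not mistaken for an orphan. The sketch only lists files; it deletes nothing.

```python
import time
from pathlib import Path


def find_stale_expire_files(directory, min_age_seconds=3600, now=None):
    """Return bayes_toks.expire* files last modified at least min_age_seconds ago.

    Hypothetical helper; the age cutoff is an assumed safety margin,
    not something SpamAssassin itself defines.
    """
    now = time.time() if now is None else now
    stale = []
    for path in Path(directory).glob("bayes_toks.expire*"):
        if path.is_file() and now - path.stat().st_mtime >= min_age_seconds:
            stale.append(path)
    return sorted(stale)


if __name__ == "__main__":
    sa_dir = Path.home() / ".spamassassin"  # assumed per-user location
    if sa_dir.is_dir():
        for path in find_stale_expire_files(sa_dir):
            print(f"{path}\t{path.stat().st_size} bytes")
```

To keep new orphans from appearing, SpamAssassin documents a `bayes_auto_expire 0` setting and an `sa-learn --force-expire` command. These look like the pieces the linked post has in mind for the cron approach, though the excerpt is cut off at that point.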