5/14 SpamAssassin's false negative rate has soared in the last week or so.
What's going on? Don't say SpamAssassin sucks, because I already know
that. But usually it gets updated regularly so that it can at least
keep its head above water. Is root no longer updating it?
\_ ifile
\_ are you really getting false negatives, or just emails not
checked by spamAssassin because spamd keeps falling over?
\_ I'm getting false negatives. I switched to using
spamassassin directly yesterday. - !OP
\_ spamassassin is actively subverted by spammers.
because of its relatively fixed ruleset, this is
trivial for them to do. use ifile.
\_ spamass has bayesian filtering now. it works
pretty well for me --aaron
\_ pointers on setting bayesian filter up? thanks
\_ If it was trivial, more would do it. I just
counted, of the last 100 spam sent to me,
97 were blocked.
\_ "Were" not "was", and it _is_ trivial. And I've
run at 98% accuracy with ifile for months (until
i went over quota and my data file got nuked. Need
to talk with the dev team about that). --scotsman
\_ Okay you've convinced me, what is the
quickest way for me to switch from
using sa to ifile in my .procmailrc?
\_ My .procmailrc is readable. The scripts
used are in ~scotsman/bin. I can give
you a large ball of spam to run through
ifile.learn.mailbox as a seed. I use mutt
hooks to retrain messages. my .muttrc is
also readable. but basically, you just
run a mailbox of good mail through
"ifile.learn.mailbox good mailbox"
and a spam ball through
"ifile.learn.mailbox spam mailbox"
then run mail through ifile.inject.header
(or ifile.inject-learn.header to continue
reeducation) --scotsman
\_ Thanks, I'll take a look.
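For anyone wondering what the learn-then-classify steps above amount to, here is a minimal sketch of a word-count naive Bayes filter, the general kind of classifier ifile is. It is an illustration only: it is not ifile's code and not what the ifile.learn.mailbox or ifile.inject.header scripts do internally. The TinyBayes class, its method names, and the add-one smoothing are all assumptions made up for this example.

```python
import math
import re
from collections import Counter


class TinyBayes:
    """Illustrative word-count naive Bayes filter (hypothetical, not ifile)."""

    def __init__(self):
        # Per-folder word counts and per-folder message counts.
        self.words = {"good": Counter(), "spam": Counter()}
        self.msgs = Counter()

    @staticmethod
    def tokens(text):
        # Crude tokenizer: lowercase runs of letters, digits, $ and '.
        return re.findall(r"[a-z0-9$']+", text.lower())

    def learn(self, folder, text):
        # Analogous to feeding one message from a "good" or "spam" mailbox.
        self.words[folder].update(self.tokens(text))
        self.msgs[folder] += 1

    def classify(self, text):
        # Pick the folder with the highest log-probability score.
        vocab = len(set(self.words["good"]) | set(self.words["spam"]))
        total_msgs = sum(self.msgs.values())
        best, best_score = None, -math.inf
        for folder, counts in self.words.items():
            n = sum(counts.values())
            # Smoothed prior, then add-one smoothed word likelihoods.
            score = math.log((self.msgs[folder] + 1) / (total_msgs + 2))
            for tok in self.tokens(text):
                score += math.log((counts[tok] + 1) / (n + vocab + 1))
            if score > best_score:
                best, best_score = folder, score
        return best


f = TinyBayes()
f.learn("good", "meeting agenda for the project review tomorrow")
f.learn("good", "lunch on friday? the project is done")
f.learn("spam", "cheap pills buy now")
f.learn("spam", "buy cheap watches now free offer")
print(f.classify("buy cheap pills"))
print(f.classify("project meeting friday"))
```

Retraining on a misfiled message is just another learn() call with the correct folder, which is roughly the role the mutt hooks play in the setup described above.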
\_ Are you sure you are not just getting more spam? I have been
getting a consistent 2-5% false negative with SA from the
start. -ausman
\_ testing ifile on my spam spool and known good mail spool,
it seems that ifile would have tagged 10% of my spam spool
as good and 0.4% of my good mail as spam. since i don't
have separate spools of spamassassin misfiles, i am not
sure how it compares. subjectively it seems ifile is more
porous than spamassassin on my mail. more training required?
\_ This is after training it on those spools?
\_ yep, trained on the spools, then tested on the same
spools. the spam spool is ~15MB, the good one ~7MB.
\_ This can be explained away, because it learns the
mailboxes one message at a time. The end data set
after learning 1000 messages can skew how earlier
messages in that group might be filed. but, moving
forward, the percentages should be much better. I'm
glad to see your false positive number was much lower
than your false negatives... I should also have prefaced
my percentages by saying that first I ran a 3000+ message
spamspool i got from dbushong. --scotsman
\_ My experience with ifile is that it works
significantly less well than spamassassin. It
depends on what kind of spam you get, but I've
switched on one of my two primary spam-reception
accounts, and that one definitely has fewer
spam-categorization errors than the other. -tom
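The 10% / 0.4% figures quoted earlier came from training on the spools and then testing on the same spools. A rough sketch of how those two rates are computed, plus a simple held-out split so the test messages are ones the filter has not seen, might look like this. The function names, the 20% test fraction, and the message counts in the example are assumptions for illustration; only the 10% and 0.4% rates come from the thread.

```python
import random


def error_rates(spam_verdicts, good_verdicts):
    """Return (false_negative_rate, false_positive_rate).

    spam_verdicts: filter verdicts ("spam"/"good") for known-spam messages.
    good_verdicts: filter verdicts for known-good messages.
    """
    false_neg = sum(1 for v in spam_verdicts if v != "spam")
    false_pos = sum(1 for v in good_verdicts if v != "good")
    return false_neg / len(spam_verdicts), false_pos / len(good_verdicts)


def split_holdout(messages, test_fraction=0.2, seed=0):
    """Shuffle and split into (train, test) so testing uses unseen mail."""
    shuffled = list(messages)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - int(len(shuffled) * test_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical counts chosen only to reproduce the rates from the thread:
# 1 of 10 spam filed as good, 1 of 250 good messages filed as spam.
fn_rate, fp_rate = error_rates(["spam"] * 9 + ["good"],
                               ["good"] * 249 + ["spam"])
print(f"false negatives: {fn_rate:.1%}, false positives: {fp_rate:.1%}")

train, test = split_holdout(range(10))
print(len(train), len(test))
```

Training on the train part of each spool and scoring only the test part gives a number that is closer to what the filter will do on new mail than a same-spool test does.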