YASC: Yet Another Spam Chart

[Figure: SA Score Histogram]

Gabriel Radic observed that it is indeed possible to make Mail.app differentiate between messages based on their SpamAssassin score (in reference to my previous entry bemoaning the need for better filtering). Specifically, SA adds an X-Spam-Level header whose contents are the message's score represented as asterisks (one point = one asterisk). By creating a rule that filters on this header (the "Edit Header List..." command allows filtering based on custom headers), it is possible to act on messages per score (e.g. color them differently, or even delete them outright).
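To make the asterisk encoding concrete, here is a minimal Python sketch of how such a header maps to a score, and why a rule matching a run of N asterisks behaves like a "score of at least N" threshold. It assumes the header looks like `X-Spam-Level: *******` and that the rule does a plain "contains" match; the function names `spam_level` and `rule_matches` are made up for illustration and are not part of Mail.app or SpamAssassin.

```python
from email import message_from_string


def spam_level(raw_message: str) -> int:
    """Return the number of asterisks in X-Spam-Level (0 if the header is absent)."""
    msg = message_from_string(raw_message)
    return (msg.get("X-Spam-Level") or "").count("*")


def rule_matches(raw_message: str, threshold: int) -> bool:
    """Mimic a 'header contains N asterisks' rule.

    A run of N asterisks is a substring of any longer run, so the
    rule fires for every message scoring N or more.
    """
    msg = message_from_string(raw_message)
    return "*" * threshold in (msg.get("X-Spam-Level") or "")


# Hypothetical message with a score of 7.
sample = "From: a@example.com\nX-Spam-Level: *******\nSubject: hi\n\nbody\n"
print(spam_level(sample), rule_matches(sample, 5), rule_matches(sample, 11))
```

Because the match is a substring test, several rules with different asterisk counts would have to be ordered from highest to lowest for per-score handling to work as intended.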

The latter possibility interests me the most at this time, since anything that reduces the number of messages I have to deal with helps (though this would still be a stop-gap solution until I have time to implement my challenge-response solution). The question then becomes: what should this threshold be set to? I hacked up a Perl script that uses Email::Folder to parse my Junk mbox and generate (via Excel) a histogram of spam scores. The figure shows the results, with the quartiles (roughly) highlighted. For starters, I have decided to set the threshold at 11, which (given this week's distribution) would remove about half the messages outright. This would still leave me with about 8,000 messages per week, so I may have to be even more stringent.
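For anyone who wants to reproduce this kind of chart, the following is a rough Python sketch of the same idea. It is not the Perl script mentioned above: it swaps in Python's standard `mailbox` module for Email::Folder and prints the counts as text instead of going through Excel. The helper names and the mbox path are placeholders chosen for the example.

```python
import mailbox
from collections import Counter


def score_histogram(mbox_path: str) -> Counter:
    """Count messages per spam level (asterisks in the X-Spam-Level header)."""
    counts = Counter()
    for msg in mailbox.mbox(mbox_path):
        level = str(msg.get("X-Spam-Level") or "").count("*")
        counts[level] += 1
    return counts


def fraction_removed(counts: Counter, threshold: int) -> float:
    """Fraction of messages whose level is at or above the threshold."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    removed = sum(n for level, n in counts.items() if level >= threshold)
    return removed / total


def print_histogram(counts: Counter) -> None:
    """Print one 'level count' row per observed spam level."""
    for level in sorted(counts):
        print(f"{level:3d} {counts[level]:6d}")


if __name__ == "__main__":
    counts = score_histogram("Junk.mbox")  # placeholder path
    print_histogram(counts)
    print(f"threshold 11 removes {fraction_removed(counts, 11):.0%}")
```

Running `fraction_removed` over a range of thresholds gives a quick way to see which cutoff removes roughly a quarter, half, or three quarters of the week's junk.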
