Mailing List Archive: SpamAssassin: devel
TVD_FROM_1 false positive

cedric at gn

Apr 4, 2012, 3:14 AM

This rule has been mentioned here before by flo [at] rfc822 back in 2009,
when it scored a mere 1.0. In the 3.3.1 update channel, active.cf has:

##{ TVD_FROM_1
header TVD_FROM_1 From:addr =~
##} TVD_FROM_1
score TVD_FROM_1 2.799 2.799 2.799 2.799

I've noticed it hitting the domain of a concerned user. Off the top of
my head, I can think of other reputable domains ending in at least 1 or
2 digits, and I don't personally see 3 digits as an inherently spammy
characteristic (although many domains ending in 360 or 365 are indeed
associated with spam or dirty lists).

In my humble opinion:

(a) the high and variable score may be a result of an insufficiently
diverse ham corpus for the rescore mass check. (I'd contribute in a
small way myself, but I'm put off less by the amount of work involved
than by the fact that it's time-critical and I don't see any
announcements.)

(b) it might be better if rules like this, which presumably hit a large
volume of spam over a short period, were combined with other
characteristics of the same spam in a meta rule. The components could
be written as subrules or held to a score of at most 0.1, but merely
allowing the scorer to choose between the meta rule and its components
could have a similar effect. This might not only reduce the adverse
effect of potential false positives but also, in the absence of a
description, clarify the intention of the rule or the type of spam it's
aimed at.
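
As a rough sketch of what (b) could look like: every rule name and both
patterns below are hypothetical placeholders, not taken from the actual
ruleset, and the header pattern is a stand-in rather than the real
TVD_FROM_1 expression. The double-underscore prefix marks unscored
subrules, so only the combination contributes to the total.

```
# Hypothetical subrule: From address whose domain label ends in 3 digits
# (stand-in pattern, NOT the actual TVD_FROM_1 regex)
header   __EX_FROM_3DIGITS    From:addr =~ /\d{3}\.[a-z]{2,}$/i

# Hypothetical subrule: some other trait of the same spam run
body     __EX_SPAM_TRAIT      /placeholder phrase from the spam run/i

# Only the combination is scored; a describe line records the intent
meta     EX_FROM_DIGITS_COMBO (__EX_FROM_3DIGITS && __EX_SPAM_TRAIT)
describe EX_FROM_DIGITS_COMBO From domain ends in 3 digits plus another spam trait
score    EX_FROM_DIGITS_COMBO 1.0
```

With this shape, a legitimate sender whose domain merely ends in digits
would match only the unscored subrule and pick up no points from it.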

What's to be done?

All best wishes,

Cedric Knight
