
Mailing List Archive: SpamAssassin: users

cleanup for DNSBLs

 

 



antispam at khopis

Nov 23, 2009, 4:34 PM

Post #1 of 7 (1384 views)
cleanup for DNSBLs

Unless there are objections, I'm going to add two tests to my sandbox:

RCVD_IN_NIX_SPAM, a new (to us) DNSBL populated by the same source as
the original [N]iXhash zone, with results on intra2net that look quite
promising: 72.98:0.12 spam:ham (PSBL has 48.69:0.36),
http://www.intra2net.com/en/support/antispam/blacklist.php_dnsbl=RCVD_IN_NIX_SPAM.html

RCVD_IN_SPAMCOP, a fix-up of SpamCop to limit it to the last external
relay (just like every other DNSBL used by SpamAssassin).
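To illustrate the difference between checking only the last external relay and parsing every relay in the headers, here is a minimal Python sketch. It is not SpamAssassin's implementation: the network list, the function names and the sample relay addresses are hypothetical, and the "deep" mode is simplified to "every relay outside our own networks".

```python
from ipaddress import ip_address, ip_network

# Hypothetical internal networks; in SpamAssassin the equivalent information
# would come from the trusted_networks / internal_networks settings.
INTERNAL_NETWORKS = [ip_network("10.0.0.0/8"), ip_network("192.0.2.0/24")]


def is_internal(ip):
    """Return True if the relay address belongs to one of our own networks."""
    addr = ip_address(ip)
    return any(addr in net for net in INTERNAL_NETWORKS)


def last_external_relay(relays):
    """Return the relay that handed the message to our network, or None.

    `relays` lists relay IPs newest first, i.e. in the order the Received
    headers appear in the message.
    """
    for ip in relays:
        if not is_internal(ip):
            return ip
    return None


def relays_to_query(relays, lastexternal=True):
    """Pick which relay IPs would be looked up in a DNSBL."""
    if lastexternal:
        ip = last_external_relay(relays)
        return [ip] if ip is not None else []
    # Simplified "deep" parsing: query every relay outside our networks.
    return [ip for ip in relays if not is_internal(ip)]
```

With the sample relays `["10.1.2.3", "91.132.70.11", "198.51.100.7"]`, the lastexternal mode queries only `91.132.70.11`, while the deep mode also queries `198.51.100.7`, an earlier hop that may be an innocent dynamic or shared address.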

While digging around there, I noticed that SpamCop and the ham rule
RCVD_IN_BSP_TRUSTED are the only rules to use check_rbl_txt(), which
gives them a nicer explanation of what triggered the listing. For a
fully apples-to-apples comparison, my fix-up reverts to plain old
check_rbl() ... which unfortunately means a second DNS lookup (since
we're looking for an A record rather than a TXT record).

Both will be marked "nopublish" until we have stats to motivate us.


check_rbl_txt() gives quite informative data, and it's supported by
every DNSBL I've tried (all below). RCVD_IN_NIX_SPAM supports it
(though my test will avoid it until we can determine there isn't a bug
in lookups here), as do BRBL and others. Assuming there are no bugs or
efficiency problems, we should probably use it for any zone that
doesn't combine multiple lists (like zen).

Examples:

$ host -t txt 11.70.132.91.ix.dnsbl.manitu.net.
11.70.132.91.ix.dnsbl.manitu.net descriptive text "Spam sent to the
mailhost mx.selfip.biz was detected by NiX Spam at Mon, 23 Nov 2009
23:31:24 +0100, see
http://www.dnsbl.manitu.net/lookup.php?value=91.132.70.11"
$ host -t txt 11.70.132.91.bb.barracudacentral.org
11.70.132.91.bb.barracudacentral.org descriptive text
"http://www.barracudanetworks.com/reputation/?pr=1&ip=91.132.70.11"
$ host -t txt 11.70.132.91.bl.spamcop.net.
11.70.132.91.bl.spamcop.net descriptive text "Blocked - see
http://www.spamcop.net/bl.shtml?91.132.70.11"
$ host -t txt 11.70.132.91.psbl.surriel.com.
11.70.132.91.psbl.surriel.com descriptive text "Listed in PSBL, see
http://psbl.surriel.com/listing?ip=91.132.70.11"
$ host -t txt 11.70.132.91.bl.spameatingmonkey.net.
11.70.132.91.bl.spameatingmonkey.net descriptive text "listed, see
http://spameatingmonkey.com/lookup/91.132.70.11"

(If you're wondering, that IP is listed as the #1 offender by spamcop,
so it hits all of them. 127.0.0.2 gives unrepresentative responses
since it is a test entry and is often labeled as such.)
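The query names in the examples above follow the usual DNSBL convention: reverse the IPv4 octets and append the list's zone. Here is a minimal Python sketch of that construction, plus a plain A-record check; the function names are hypothetical, and treating any A answer as "listed" is a simplification, since real lists return specific 127.0.0.x codes.

```python
import socket
from ipaddress import IPv4Address


def dnsbl_query_name(ip, zone):
    """Build the DNSBL query name: reversed IPv4 octets followed by the zone."""
    octets = str(IPv4Address(ip)).split(".")  # also validates the address
    return ".".join(reversed(octets)) + "." + zone.strip(".")


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the A-record lookup for the query name succeeds.

    `resolve` is injectable so the function can be exercised without network
    access; by default it performs a real DNS lookup.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
    except OSError:  # socket.gaierror (e.g. NXDOMAIN) is a subclass of OSError
        return False
    return True
```

For example, `dnsbl_query_name("91.132.70.11", "ix.dnsbl.manitu.net")` yields `11.70.132.91.ix.dnsbl.manitu.net`, the name queried in the first `host` command above. Fetching the TXT record would need a DNS library, which this sketch leaves out.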


wtogami at redhat

Nov 23, 2009, 5:10 PM

Post #2 of 7 (1342 views)
Re: cleanup for DNSBLs [In reply to]

On 11/23/2009 07:34 PM, Adam Katz wrote:
> Unless there are objections, I'm going to add two tests to my sandbox:
>
> RCVD_IN_NIX_SPAM, a new (to us) DNSBL populated by the same source as
> the original [N]iXhash zone, with results on intra2net that look quite
> promising: 72.98:0.12 spam:ham (PSBL has 48.69:0.36),
> http://www.intra2net.com/en/support/antispam/blacklist.php_dnsbl=RCVD_IN_NIX_SPAM.html
>

Is intra2net measuring the blacklists without lastexternal as their
suggested spamassassin config suggests? If so then their results are
suspect.

Warren


antispam at khopis

Nov 23, 2009, 7:40 PM

Post #3 of 7 (1343 views)
Re: [SA] cleanup for DNSBLs [In reply to]

Warren Togami wrote:
> On 11/23/2009 07:34 PM, Adam Katz wrote:
>> Unless there are objections, I'm going to add two tests to my sandbox:
>>
>> RCVD_IN_NIX_SPAM, a new (to us) DNSBL populated by the same source as
>> the original [N]iXhash zone, with results on intra2net that look quite
>> promising: 72.98:0.12 spam:ham (PSBL has 48.69:0.36),
>> http://www.intra2net.com/en/support/antispam/blacklist.php_dnsbl=RCVD_IN_NIX_SPAM.html
>
> Is intra2net measuring the blacklists without lastexternal as their
> suggested spamassassin config suggests? If so then their results are
> suspect.

It's probably similar to http://stats.dnsbl.com/ in that it doesn't do a
good job of its ham numbers but its relative spam counts are fair. Only
one way to find out...


bjoern.sikora at intra2net

Nov 24, 2009, 1:59 AM

Post #4 of 7 (1327 views)
Re: [SA] cleanup for DNSBLs [In reply to]

Hi folks,

> >> Unless there are objections, I'm going to add two tests to my sandbox:
> >>
> >> RCVD_IN_NIX_SPAM, a new (to us) DNSBL populated by the same source as
> >> the original [N]iXhash zone, with results on intra2net that look quite
> >> promising: 72.98:0.12 spam:ham (PSBL has 48.69:0.36),
> >> http://www.intra2net.com/en/support/antispam/blacklist.php_dnsbl=RCVD_IN_NIX_SPAM.html
> >
> > Is intra2net measuring the blacklists without lastexternal as their
> > suggested spamassassin config suggests? If so then their results are
> > suspect.

Yes, you are right: at the moment we show some results with deep header
checks. Each set of stats belongs to the SpamAssassin config shown with
it. If you use the -lastexternal option with an appropriate trust path,
the false positive rate may decrease.

> It's probably similar to http://stats.dnsbl.com/ in that it doesn't do a
> good job of its ham numbers but its relative spam counts are fair. Only
> one way to find out...

I wouldn't say so ;-). We try to provide proper results for ham measurement
corresponding to the displayed SpamAssassin config, and of course we are using
"live" data and update the statistics every Monday. Some RBLs are currently
shown with the -lastexternal option, like Spamhaus or CBL, as explicitly
recommended on the blacklist webpage.

If you would like us to show the -lastexternal results for all lists,
please let me know.

Regards,

Bjoern

--
Björn Sikora

Intra2net AG
Mömpelgarder Weg 8
72072 Tübingen
Germany

Tel: +49-7071-56510-19
Fax: +49-7071-56510-50
Email: bjoern.sikora [at] intra2net
http://www.intra2net.com

Management Board: Steffen Jarosch
Chairman of the Supervisory Board: Ulrich Emmert
Commercial register: HRB 382270 Stuttgart
VAT ID: DE216036710




mysqlstudent at gmail

Apr 17, 2010, 12:30 PM

Post #5 of 7 (1049 views)
Re: cleanup for DNSBLs [In reply to]

Hi Adam,

Some time ago you posted that you were investigating the stats and
effectiveness of a few rules in your masschecks sandbox, and I thought I
would see whether you had made any progress or found anything helpful.

Posted below...

Thanks,
Alex

On Mon, Nov 23, 2009 at 8:34 PM, Adam Katz <antispam [at] khopis> wrote:
> Unless there are objections, I'm going to add two tests to my sandbox:
>
> RCVD_IN_NIX_SPAM, a new (to us) DNSBL populated by the same source as
> the original [N]iXhash zone, with results on intra2net that look quite
> promising:  72.98:0.12 spam:ham (PSBL has 48.69:0.36),
> http://www.intra2net.com/en/support/antispam/blacklist.php_dnsbl=RCVD_IN_NIX_SPAM.html
>
> RCVD_IN_SPAMCOP, a fix-up of SpamCop to limit it to the last external
> relay (just like every other DNSBL used by SpamAssassin).
>
> While digging around there, I noticed that SpamCop and ham rule
> RCVD_IN_BSP_TRUSTED are the only rules to use check_rbl_txt(), which
> affords it a nicer explanation of what triggered the spam.  For a
> fully apples-to-apples comparison, my fix-up reverts back to plain-old
> check_rbl() ... which unfortunately means a second DNS lookup (since
> we're looking for an A record rather than a TXT record).
>
> Both will be marked "nopublish" until we have stats to motivate us.
>
>
> check_rbl_txt() gives quite informative data, and it's supported by
> every DNSBL I've tried (all below).  RCVD_IN_NIX_SPAM supports it
> (though my test will avoid it until we can determine there isn't a bug
> in lookups here), as do BRBL and others.  Assuming a lack of bugs or
> efficiency, we should probably use it for any index that doesn't
> contain multiple indices (like zen).
>
> Examples:
>
> $ host -t txt 11.70.132.91.ix.dnsbl.manitu.net.
> 11.70.132.91.ix.dnsbl.manitu.net descriptive text "Spam sent to the
> mailhost mx.selfip.biz was detected by NiX Spam at Mon, 23 Nov 2009
> 23:31:24 +0100, see
> http://www.dnsbl.manitu.net/lookup.php?value=91.132.70.11"
> $ host -t txt 11.70.132.91.bb.barracudacentral.org
> 11.70.132.91.bb.barracudacentral.org descriptive text
> "http://www.barracudanetworks.com/reputation/?pr=1&ip=91.132.70.11"
> $ host -t txt 11.70.132.91.bl.spamcop.net.
> 11.70.132.91.bl.spamcop.net descriptive text "Blocked - see
> http://www.spamcop.net/bl.shtml?91.132.70.11"
> $ host -t txt 11.70.132.91.psbl.surriel.com.
> 11.70.132.91.psbl.surriel.com descriptive text "Listed in PSBL, see
> http://psbl.surriel.com/listing?ip=91.132.70.11"
> $ host -t txt 11.70.132.91.bl.spameatingmonkey.net.
> 11.70.132.91.bl.spameatingmonkey.net descriptive text "listed, see
> http://spameatingmonkey.com/lookup/91.132.70.11"
>
> (If you're wondering, that IP is listed as the #1 offender by spamcop,
> so it hits all of them.  127.0.0.2 gives inaccurate responses since it
> is a test and often is called that.)
>


antispam at khopis

Apr 19, 2010, 8:41 PM

Post #6 of 7 (1025 views)
Re: cleanup for DNSBLs [In reply to]

On 04/17/2010 03:30 PM, Alex wrote:
> Some time ago you posted that you were investigating the stats and
> effectiveness of a few rules in your masschecks sandbox, and thought
> I would see if you had made any progress, and found anything
> helpful?

Yeah, analysis (and writing it up) is time-consuming and I was putting
it off. Here it is.

> On Mon, Nov 23, 2009 at 8:34 PM, Adam Katz <antispam [at] khopis> wrote:
>> Unless there are objections, I'm going to add two tests to my sandbox:
>>
>> RCVD_IN_NIX_SPAM, a new (to us) DNSBL populated by the same source as
>> the original [N]iXhash zone, with results on intra2net that look quite
>> promising: 72.98:0.12 spam:ham (PSBL has 48.69:0.36),
>> http://www.intra2net.com/ [...]

DateRev    SPAM%    HAM%    S/O    RANK  NAME
20091219   6.0855   0.0158  0.997  0.91  T_RCVD_IN_NIX_SPAM
20091226   6.6822   0.0171  0.997  0.91  T_RCVD_IN_NIX_SPAM
20100116   8.8194   0.0079  0.999  0.93  T_RCVD_IN_NIX_SPAM
20100123   9.6367   0.0060  0.999  0.94  T_RCVD_IN_NIX_SPAM

Here are all the results ruleqa was willing to yield. I've removed the
runs with fewer than about a million spams, since the data from such
runs is non-representative for most rules. After January, ruleqa stopped
evaluating the rule (and RCVD_IN_SPAMCOP) altogether, so I'm not
confident in the results as they never leveled out.
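For readers unfamiliar with the S/O column: the figures in these tables are consistent with S/O = SPAM% / (SPAM% + HAM%). The following Python sketch assumes that formula (it is inferred from the numbers here, not taken from the ruleqa source) and recomputes the column for the NiX rows.

```python
def s_o_ratio(spam_pct, ham_pct):
    """Spam-over-overall ratio: share of a rule's hit rate that comes from spam.

    Assumes S/O = spam% / (spam% + ham%), which matches the tabulated values.
    """
    total = spam_pct + ham_pct
    if total == 0:
        raise ValueError("rule never hit; S/O is undefined")
    return spam_pct / total


# (DateRev, SPAM%, HAM%) rows for T_RCVD_IN_NIX_SPAM from the table above.
NIX_ROWS = [
    ("20091219", 6.0855, 0.0158),
    ("20091226", 6.6822, 0.0171),
    ("20100116", 8.8194, 0.0079),
    ("20100123", 9.6367, 0.0060),
]

for date_rev, spam_pct, ham_pct in NIX_ROWS:
    print(date_rev, round(s_o_ratio(spam_pct, ham_pct), 3))
```

Rounded to three places this gives 0.997, 0.997, 0.999 and 0.999, matching the S/O column. The ratio rewards a low ham hit rate but says nothing about volume, which is why SPAM% has to be read alongside it.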

Based on those results, NiX performs quite well, but not at a level that
justifies including it in SA proper, as it just creates too much DNS traffic.

Jari Fredricksson's recent "Top Ten Rules" post to the list has
RCVD_IN_NIX_SPAM ranked 11th (he posted 20 rules; "Ten" was only in the
thread name) with 72.29% spam versus 16% ham at 0.998 S/O (total
ham+spam corpus = 20293). Jari is in NE Europe, like this DNSBL's
spamtrap fodder. My company gets over 17.6% spam on NiX as well.

>> RCVD_IN_SPAMCOP, a fix-up of SpamCop to limit it to the last
>> external relay (just like every other DNSBL used by SpamAssassin).

This again only found four useful trials. The results show that SpamCop
is indeed a well-maintained DNSBL with a very low FP rate, but it
doesn't have the sheer volume of the others.

DateRev    SPAM%    HAM%    S/O    RANK  NAME
20091219  11.9204   0.0390  0.997  0.89  T_RCVD_IN_SPAMCOP
20091226  10.4777   0.0367  0.997  0.88  T_RCVD_IN_SPAMCOP
20100116  12.2375   0.0953  0.992  0.81  T_RCVD_IN_SPAMCOP
20100123  13.7493   0.0324  0.998  0.90  T_RCVD_IN_SPAMCOP

Compared to the full parsing of headers:

DateRev    SPAM%    HAM%    S/O    RANK  NAME
20091219  57.4236   1.8637  0.969  0.62  RCVD_IN_BL_SPAMCOP_NET
20091226  57.1671   1.7706  0.970  0.62  RCVD_IN_BL_SPAMCOP_NET
20100116  58.6552   1.7156  0.972  0.62  RCVD_IN_BL_SPAMCOP_NET
20100123  59.0184   1.6012  0.974  0.62  RCVD_IN_BL_SPAMCOP_NET

... it would be a shame to strike spamcop, but it doesn't really seem
like much of a player (because it doesn't use spamtraps). In fact, its
lack of spamtraps suggests keeping it because it's capable of listing
spammers that successfully avoid spamtraps. Maybe I'll open a bug to
use the lastexternal version instead of the current one.

>> While digging around there, I noticed that SpamCop and ham rule
>> RCVD_IN_BSP_TRUSTED are the only rules to use check_rbl_txt(),
>> which affords it a nicer explanation of what triggered the spam.
>> For a fully apples-to-apples comparison, my fix-up reverts back to
>> plain-old check_rbl() ... which unfortunately means a second DNS
>> lookup (since we're looking for an A record rather than a TXT
>> record).
>>
>> Both will be marked "nopublish" until we have stats to motivate
>> us.
>>
>> check_rbl_txt() gives quite informative data, and it's supported
>> by every DNSBL I've tried (all below). RCVD_IN_NIX_SPAM supports
>> it (though my test will avoid it until we can determine there isn't
>> a bug in lookups here), as do BRBL and others. Assuming a lack of
>> bugs or efficiency, we should probably use it for any index that
>> doesn't contain multiple indices (like zen).

I have no news on this front. That was meant more as a question to
the other developers. I suppose the TXT data is more verbose and
therefore eats more bandwidth, so SA doesn't use it?


mysqlstudent at gmail

Apr 20, 2010, 8:27 PM

Post #7 of 7 (1001 views)
Re: cleanup for DNSBLs [In reply to]

Hi Adam,

>> Some time ago you posted that you were investigating the stats and
>> effectiveness of a few rules in your masschecks sandbox, and thought
>> I would see if you had made any progress, and found anything
>> helpful?
>
> Yeah, analysis (and writing it up) is time-consuming and I was putting
> it off.  Here it is.

Thanks for the info. Hope to see further analysis of your efforts in the future.

Best,
Alex
