
Mailing List Archive: SpamAssassin: users

consolidating DNSBLs into a single query (was Spam Eating Monkey?)

 

 



rob at invaluement

Oct 6, 2009, 9:19 PM

Post #1 of 4
consolidating DNSBLs into a single query (was Spam Eating Monkey?)

Warren Togami wrote:
> You are misunderstanding the question. A single DNS query could
> respond different numbers meaning they are hits on different lists.
> Your lists that are subsets or supersets of other lists can easily use
> this. The querying software need only to know what each result means.

Not saying that this is a bad idea, but it does have its limitations.
For example, some lists run into the hundreds of megabytes, and
getting the whole file rsynced and updated can take more than several
minutes. Often, such lists update only once or twice per hour, if even
that often.

In contrast, some lists are smaller and faster reacting and update every
few minutes.

Trying to merge all such lists into a single list every several minutes
is no trivial task in terms of having enough CPU cycles and RAM to get
that done correctly and within a reasonably short time.

Likewise, doing the merge hourly loses the benefit of some of the
smaller-footprint faster-reacting lists which can react to emerging spam
threats faster.

Not saying such a consolidation can't be done... and maybe a few
tradeoffs here are worthwhile? But if these issues are not dealt with
smartly and competently, then one could easily find that the
all-in-one comprehensive DNSBL is not as effective as querying
the lists separately.

Also, this loses the ability to *score* on multiple lists... unless you
use a bitmasked scoring system whereby one list gets assigned ".2",
another ".4", another ".8", on to ".128". But that leaves a maximum of
only 7 lists. Sure, you can add more than 7 by employing other octets in
the "answer IP", but that only severely complicates matters.

And as it stands, you'd also have the complexity of getting the spam
filter to parse, understand, and react properly to those bitmasks.
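
To make the bitmask idea concrete, here is a minimal sketch of how a filter might decode such an answer. The list names and scores are hypothetical placeholders, not any real DNSBL's assignments; only the ".2" through ".128" bit layout comes from the description above.

```python
import ipaddress

# Hypothetical bit assignments in the last octet of the answer IP.
# Bit value 1 is skipped, matching the ".2" through ".128" scheme,
# which is why only 7 lists fit in one octet.
LIST_BITS = {
    2: "list_a",
    4: "list_b",
    8: "list_c",
    16: "list_d",
    32: "list_e",
    64: "list_f",
    128: "list_g",
}

# Hypothetical per-list scores; lists without an entry score 0.
LIST_SCORES = {"list_a": 1.5, "list_b": 0.5, "list_c": 2.0}


def decode_answer(answer_ip):
    """Return the names of the lists whose bit is set in the last octet."""
    last_octet = int(ipaddress.IPv4Address(answer_ip)) & 0xFF
    return [name for bit, name in LIST_BITS.items() if last_octet & bit]


def score_answer(answer_ip):
    """Sum the scores of every list encoded in the answer IP."""
    return sum(LIST_SCORES.get(name, 0.0) for name in decode_answer(answer_ip))
```

With these assumed assignments, an answer of 127.0.0.10 (2 + 8) decodes to list_a and list_c. Going past 7 lists would mean repeating the same masking on another octet.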

--
Rob McEwen
http://dnsbl.invaluement.com/
rob [at] invaluement
+1 (478) 475-9032


royce.williams at gmail

Oct 6, 2009, 10:42 PM

Post #2 of 4
Re: consolidating DNSBLs into a single query (was Spam Eating Monkey?)

On Tue, Oct 6, 2009 at 8:19 PM, Rob McEwen <rob [at] invaluement> wrote:
> Warren Togami wrote:
>> You are misunderstanding the question.  A single DNS query could
>> respond different numbers meaning they are hits on different lists.
>> Your lists that are subsets or supersets of other lists can easily use
>> this.  The querying software need only to know what each result means.
>
> Not saying that this is a bad idea, but it does have its limitations.
> For example, some lists run into the hundreds of megabytes, and
> getting the whole file rsynced and updated can take more than several
> minutes. Often, such lists update only once or twice per hour, if even
> that often.

Hmm ... interesting. If implemented via rbldnsd, each list could be
maintained in a separate file, and since rbldnsd can be configured to
build a single zone using multiple files on the back end, different
lists could be refreshed at different rates.

Your comments about tradeoffs and bitmasking still stand, of course.

Royce


spamassassin-users at lists

Oct 7, 2009, 1:42 AM

Post #3 of 4
Re: consolidating DNSBLs into a single query (was Spam Eating Monkey?)

On 07/10/2009 05:19, Rob McEwen wrote:

> Also, this loses the ability to *score* on multiple lists... unless you
> use a bitmasked scoring system whereby one list gets assigned ".2",
> another ".4", another ".8", on to ".128". But that leaves a maximum of
> only 7 lists. Sure, you can add more than 7 by employing other octets in
> the "answer IP", but that only severely complicates matters.
>
> And as it stands, you'd also have the complexity of getting the spam
> filter to parse, understand, and react properly to those bitmasks.

I don't understand the logic of that. Ie, why you'd need to use
bitmasking? zen.spamhaus.org is a combination of various different lists
and returns multiple values like this:

mike [at] have:~$ host -t a 2.0.0.127.zen.spamhaus.org
2.0.0.127.zen.spamhaus.org A 127.0.0.4
2.0.0.127.zen.spamhaus.org A 127.0.0.10
2.0.0.127.zen.spamhaus.org A 127.0.0.2
mike [at] have:~$

It's perfectly easy for SpamAssassin to see that three different values
have been returned, so 127.0.0.2 is on three separate lists and that an
extra score should be applied for each of those three.
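
As a rough sketch of that per-answer scoring (not SpamAssassin's actual implementation), the following builds the reversed-octet query name and scores each distinct A record. The return-code meanings follow the zen.spamhaus.org output shown above (127.0.0.2 = SBL, 127.0.0.4 = XBL, 127.0.0.10 = PBL); the score values are made-up assumptions, and no DNS lookup is performed so the example runs offline.

```python
import ipaddress

# Return codes seen in the zen.spamhaus.org answer above.
ZEN_CODES = {"127.0.0.2": "SBL", "127.0.0.4": "XBL", "127.0.0.10": "PBL"}

# Hypothetical scores, chosen only for illustration.
SCORES = {"SBL": 2.0, "XBL": 1.5, "PBL": 0.5}


def dnsbl_query_name(ip, zone):
    """Build the DNSBL query name: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def score_answers(answers):
    """Map each returned A record to its sub-list and add up the scores."""
    hits = sorted({ZEN_CODES[a] for a in answers if a in ZEN_CODES})
    return hits, sum(SCORES[h] for h in hits)


# The three A records from the single lookup shown above.
hits, total = score_answers(["127.0.0.4", "127.0.0.10", "127.0.0.2"])
```

One lookup yields all three hits, and each one contributes its own score.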

It's also quite easy to do it in Exim, eg if I wanted to block an email
in Exim if the sending ip is on both sbl.spamhaus.org and
xbl.spamhaus.org I could either do two dns lookups like this:

deny dnslists = sbl.spamhaus.org
     dnslists = xbl.spamhaus.org

Or I could do it with a single dns lookup like this:

deny dnslists = zen.spamhaus.org=127.0.0.2
     dnslists = zen.spamhaus.org=127.0.0.4

You can be 100% backwards compatible by leaving all of your lists as
they are, but then adding another one which is a combined version of all
of them...

--
Mike Cardwell - IT Consultant and LAMP developer
Cardwell IT Ltd. (UK Reg'd Company #06920226) http://cardwellit.com/


rob at invaluement

Oct 7, 2009, 6:05 AM

Post #4 of 4
Re: consolidating DNSBLs into a single query (was Spam Eating Monkey?)

Mike Cardwell wrote:
> I don't understand the logic of that. Ie, why you'd need to use
> bitmasking? zen.spamhaus.org is a combination of various different
> lists and returns multiple values like this:<SNIP>

If every list is an "outright block" list, then you are correct. My
point applies to situations where some lists are used in scoring mode,
and where there is a desire to be able to calculate a score based on
exactly which lists hit on a particular sending IP.

But even if someone tries this with all "outright block lists", and uses
rbldnsd's built-in ability to consolidate lists, then there are still
two problems:

(a) for auditing purposes, there'd be no way to tell *which* lists hit
on that IP since many use the same return codes

(b) some hundreds-of-MB-large lists which previously could have used the
lower-memory "ip4tset" would have to revert back to slower and
higher-memory-usage "ip4set", fwiw
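
Point (a) can be illustrated with a small sketch. This is a toy in-memory model, not how rbldnsd stores zones; the list names are hypothetical and the IPs are documentation addresses.

```python
def merge_lists(lists):
    """Merge {list_name: {ip: return_code}} into one {ip: set_of_codes} zone."""
    merged = {}
    for entries in lists.values():
        for ip, code in entries.items():
            merged.setdefault(ip, set()).add(code)
    return merged


# Two hypothetical lists that both answer with the common 127.0.0.2 code.
lists = {
    "list_a": {"192.0.2.1": "127.0.0.2"},
    "list_b": {"192.0.2.1": "127.0.0.2", "192.0.2.7": "127.0.0.2"},
}

merged = merge_lists(lists)
# merged["192.0.2.1"] holds a single code, so the merged answer no longer
# says whether list_a, list_b, or both listed the address.
```

Keeping the audit trail would require giving each source list its own distinct return code before merging.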

Again, not saying these problems can't be solved, only pointing them out
so that anyone who cares to try can know what they need to do, or need
to expect.

--
Rob McEwen
http://dnsbl.invaluement.com/
rob [at] invaluement
+1 (478) 475-9032
