
Mailing List Archive: SpamAssassin: users

Use message size in a rule?

 

 



rgraves at carleton

Sep 25, 2009, 10:10 AM

Post #1 of 4
Use message size in a rule?

For HTML content, we can check the length with

eval:html_eval('length', '< 384')

but I don't see anything similar for body or rawbody. For my purposes, the Content-Length from the spamc connection would do, but it doesn't seem to be exposed.

I see at least two plugins looking at length:

ImageInfo.pm: my $textlen = length(join('',@$body));
TextCat.pm: my $len = length($body);

but it seems a waste to make multiple in-memory copies of a large message just to see how big it is.

The bigger picture: I'm working on some ISP/.edu phishing rules inspired by the old 419 rules... lots of words and short phrases indicating an attempt to get our account information (either through email or free web form sites), and a meta rule that fires only if there are several hits. Due to the risk of false positives on long messages, I'd only like to apply the rules to messages with short bodies.
--
Rich Graves http://claimid.com/rcgraves
Carleton.edu Sr UNIX and Security Admin
CMC135: 507-222-7079 Cell: 952-292-6529


guenther at rudersport

Sep 26, 2009, 3:25 AM

Post #2 of 4
Re: Use message size in a rule?

On Fri, 2009-09-25 at 12:10 -0500, Rich Graves wrote:
> The bigger picture: I'm working on some ISP/.edu phishing rules
> inspired by the old 419 rules... lots of words and short phrases
> indicating an attempt to get our account information (either through
> email or free web form sites), and a meta rule that fires only if
> there are several hits. Due to the risk of false positives on long
> messages, I'd only like to apply the rules to messages with short
> bodies.

This is a plain RE rule I once wrote, to limit some rule to really short
messages only.

rawbody __KB_RAWBODY_200 /^.{0,200}$/s

Yeah, rawbody, but properly anchored and limited, no backtracking, just
consumption, and will stop early once your threshold is reached. Should
be quite cheap indeed. HTH
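
A rough sketch of how a subrule like this could gate the kind of phishing meta rule described in the original post. The `__LOCAL_PHISH_*` names, their patterns, the two-hit threshold, and the score are placeholders invented for illustration; none of them come from this thread or from the stock rule set.

```
# Length subrule from this post: matches only when the text seen by
# the rule is at most 200 characters long.
rawbody   __KB_RAWBODY_200      /^.{0,200}$/s

# Hypothetical phrase subrules (placeholder names and patterns).
body      __LOCAL_PHISH_VERIFY  /\bverify your account\b/i
body      __LOCAL_PHISH_PASSWD  /\bconfirm your password\b/i
body      __LOCAL_PHISH_QUOTA   /\bmailbox quota\b/i

# Fire only when the length subrule hits and at least two phrases hit.
meta      LOCAL_SHORT_PHISH     __KB_RAWBODY_200 && ((__LOCAL_PHISH_VERIFY + __LOCAL_PHISH_PASSWD + __LOCAL_PHISH_QUOTA) >= 2)
describe  LOCAL_SHORT_PHISH     Short message with several account-phishing phrases
score     LOCAL_SHORT_PHISH     1.0
```

Running `spamassassin --lint` after adding rules like these should catch syntax mistakes. It is also worth testing against real mail, since SpamAssassin may hand rawbody rules the text in pieces rather than as one string, which would change what "200 characters" measures.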


--
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


wtogami at redhat

Sep 26, 2009, 7:17 AM

Post #3 of 4
Re: Use message size in a rule?

On 09/26/2009 06:25 AM, Karsten Bräckelmann wrote:
> On Fri, 2009-09-25 at 12:10 -0500, Rich Graves wrote:
>> The bigger picture: I'm working on some ISP/.edu phishing rules
>> inspired by the old 419 rules... lots of words and short phrases
>> indicating an attempt to get our account information (either through
>> email or free web form sites), and a meta rule that fires only if
>> there are several hits. Due to the risk of false positives on long
>> messages, I'd only like to apply the rules to messages with short
>> bodies.
>
> This is a plain RE rule I once wrote, to limit some rule to really short
> messages only.
>
> rawbody __KB_RAWBODY_200 /^.{0,200}$/s
>
> Yeah, rawbody, but properly anchored and limited, no backtracking, just
> consumption, and will stop early once your threshold is reached. Should
> be quite cheap indeed. HTH

I suspect that limiting Adam's IXHASH rules through a meta rule with a
minimum-size subrule would eliminate many of the IXHASH false positives.
I was using his IXHASH plugin for a while, but stopped because I noticed
too many FPs on short e-mails. I wonder if his IXHASH plugin is suitable
to put into the sandbox for actual statistical testing.
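
One way such a minimum-size limit might look, as a sketch only. `__LOCAL_IXHASH` is a placeholder for the plugin's check and is assumed to be defined elsewhere; the 200-character threshold is simply borrowed from the rule quoted above, and the score is arbitrary.

```
# Matches when the text seen by the rule is at most 200 characters.
rawbody   __LOCAL_RAWBODY_SHORT  /^.{0,200}$/s

# __LOCAL_IXHASH is a placeholder for the plugin's check, renamed with a
# leading double underscore so that it does not score on its own.
# It must be defined elsewhere for this meta rule to work.
meta      LOCAL_IXHASH_LONG      __LOCAL_IXHASH && !__LOCAL_RAWBODY_SHORT
describe  LOCAL_IXHASH_LONG      iXhash hit on a message that is not very short
score     LOCAL_IXHASH_LONG      1.0
```

Negating the short-message subrule avoids writing a separate "at least N characters" pattern; only the meta rule carries a score.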

Warren


hege at hege

Sep 26, 2009, 9:37 AM

Post #4 of 4
Re: Use message size in a rule?

On Sat, Sep 26, 2009 at 12:25:32PM +0200, Karsten Bräckelmann wrote:
> On Fri, 2009-09-25 at 12:10 -0500, Rich Graves wrote:
> > The bigger picture: I'm working on some ISP/.edu phishing rules
> > inspired by the old 419 rules... lots of words and short phrases
> > indicating an attempt to get our account information (either through
> > email or free web form sites), and a meta rule that fires only if
> > there are several hits. Due to the risk of false positives on long
> > messages, I'd only like to apply the rules to messages with short
> > bodies.
>
> This is a plain RE rule I once wrote, to limit some rule to really short
> messages only.
>
> rawbody __KB_RAWBODY_200 /^.{0,200}$/s
>
> Yeah, rawbody, but properly anchored and limited, no backtracking, just
> consumption, and will stop early once your threshold is reached. Should
> be quite cheap indeed. HTH

I've used a lookahead for that, since the matched text then isn't saved
in SA internals.

/^(?=.{0,200}$)/s
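
Written out as a complete subrule, this might look as follows; the rule name is a made-up placeholder.

```
# Lookahead variant: the assertion checks the length, but the overall
# match itself is empty, so no body text is part of the match.
rawbody   __LOCAL_RAWBODY_200_LA  /^(?=.{0,200}$)/s
```

It should be usable in a meta rule in the same way as a non-lookahead length subrule.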
