Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: SpamAssassin: users

Identifying actual performance on rules

 

 

SpamAssassin users RSS feed   Index | Next | Previous | View Threaded


sollog at digiraticonsulting

Jan 22, 2012, 1:49 PM

Post #1 of 4 (296 views)
Permalink
Identifying actual performance on rules

I am trying to locate reason for performance spikes. I have read the various wiki pages, and they suggest solutions but not a way to identify the bottleneck. Is there way to increase logging so that I can begin to identify or rule out the actual performance bottlenecks?


Munroe Sollog
Digirati Consulting
sollog [at] digiraticonsulting
(610) 332-7234 x805
Attachments: smime.p7s (4.70 KB)


jhardin at impsec

Jan 23, 2012, 8:03 AM

Post #2 of 4 (274 views)
Permalink
Re: Identifying actual performance on rules [In reply to]

On Sun, 22 Jan 2012, Munroe Sollog wrote:

> I am trying to locate reason for performance spikes. I have read the
> various wiki pages, and they suggest solutions but not a way to identify
> the bottleneck. Is there way to increase logging so that I can begin to
> identify or rule out the actual performance bottlenecks?

Sure. Put this into a .pre file:

loadplugin HitFreqsRuleTiming ./HitFreqsRuleTiming.pm

(set the path appropriately for where the .pm file actually lives)

And then run your SpamAssassin in debug mode, for example:

spamassassin -L -t --debug area=all,rules < "$testmsg"

You should then see logging output like:

Jan 13 06:27:45.701 [30309] dbg: rules: [...] rulename ovl(s) max(s) #run %tot
Jan 13 06:27:45.701 [30309] dbg: rules: [...] AWL 0.1862 0.1862 1 35.98%
Jan 13 06:27:45.701 [30309] dbg: rules: [...] ALL_FULL 0.0485 0.0485 1 9.38%
Jan 13 06:27:45.701 [30309] dbg: rules: [...] ALL_RAWBODY 0.0077 0.0077 1 1.48%
Jan 13 06:27:45.701 [30309] dbg: rules: [...] T_LONG_HEADER_LINE_80 0.0052 0.0052 1 1.01%


--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin [at] impsec FALaholic #11174 pgpk -a jhardin [at] impsec
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Any time law enforcement becomes a revenue center, the system
becomes corrupt.
-----------------------------------------------------------------------
Today: John Moses Browning's 157th Birthday


guenther at rudersport

Jan 23, 2012, 4:50 PM

Post #3 of 4 (269 views)
Permalink
Re: Identifying actual performance on rules [In reply to]

On Mon, 2012-01-23 at 08:03 -0800, John Hardin wrote:
> On Sun, 22 Jan 2012, Munroe Sollog wrote:
>
> > I am trying to locate reason for performance spikes. I have read the
> > various wiki pages, and they suggest solutions but not a way to identify
> > the bottleneck. Is there way to increase logging so that I can begin to
> > identify or rule out the actual performance bottlenecks?

Using debugging as mentioned by John will definitely be helpful to
pinpoint a specific sub-system or rule causing higher load. However,
this pretty much requires sample messages causing the load -- basically
implied by running the stand-alone 'spamassassin' script.

Enabling debug mode on a production system should be done with care and
while closely monitoring the system -- it will result in GOBS of debug
info being logged. Mentioning this, since I understand the OP to not yet
have samples, but generally noticing "performance spikes".

Thus, step one should be to find some samples causing an increased load,
and then to investigate further.


> Sure. Put this into a .pre file:
>
> loadplugin HitFreqsRuleTiming ./HitFreqsRuleTiming.pm
>
> (set the path appropriately for where the .pm file actually lives)
>
> And then run your SpamAssassin in debug mode, for example:
>
> spamassassin -L -t --debug area=all,rules < "$testmsg"
>
> You should then see logging output like:
>
> Jan 13 06:27:45.701 [30309] dbg: rules: [...] rulename ovl(s) max(s) #run %tot
> Jan 13 06:27:45.701 [30309] dbg: rules: [...] AWL 0.1862 0.1862 1 35.98%
> Jan 13 06:27:45.701 [30309] dbg: rules: [...] ALL_FULL 0.0485 0.0485 1 9.38%
> Jan 13 06:27:45.701 [30309] dbg: rules: [...] ALL_RAWBODY 0.0077 0.0077 1 1.48%
> Jan 13 06:27:45.701 [30309] dbg: rules: [...] T_LONG_HEADER_LINE_80 0.0052 0.0052 1 1.01%

--
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


jhardin at impsec

Jan 23, 2012, 5:59 PM

Post #4 of 4 (263 views)
Permalink
Re: Identifying actual performance on rules [In reply to]

On Tue, 24 Jan 2012, Karsten Bräckelmann wrote:

> On Mon, 2012-01-23 at 08:03 -0800, John Hardin wrote:
>> On Sun, 22 Jan 2012, Munroe Sollog wrote:
>>
>>> I am trying to locate reason for performance spikes. I have read the
>>> various wiki pages, and they suggest solutions but not a way to identify
>>> the bottleneck. Is there way to increase logging so that I can begin to
>>> identify or rule out the actual performance bottlenecks?
>
> Using debugging as mentioned by John will definitely be helpful to
> pinpoint a specific sub-system or rule causing higher load. However,
> this pretty much requires sample messages causing the load -- basically
> implied by running the stand-alone 'spamassassin' script.
>
> Enabling debug mode on a production system should be done with care and
> while closely monitoring the system -- it will result in GOBS of debug
> info being logged. Mentioning this, since I understand the OP to not yet
> have samples, but generally noticing "performance spikes".
>
> Thus, step one should be to find some samples causing an increased load,
> and then to investigate further.

Yes, very good point. Thanks for clarifying that, and apologies on my part
for assuming you had problematic messages already identified.

SA should log per-message scan times to the system log. From that you
should be able to determine a message recipient and message-ID for
messages that take a long time to scan.

You may also be experiencing inbound traffic volume spikes that may be
overloading your system.

So, if you can track when the processing spikes occur, two things to
correlate are the number of messages you're processing at that time (you
may have too many spamd children defined, or not enough memory - are you
hitting swap?), and any specific messages received at that time that take
an unusually long time to scan (they may be exercising weaknesses in the
rules).

One other thing: what version of SA are you using? Older versions have had
rules that were known to perform poorly in certain situations and that
have been fixed in newer versions.

--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin [at] impsec FALaholic #11174 pgpk -a jhardin [at] impsec
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Vista is at best mildly annoying and at worst makes you want to
rush to Redmond, Wash. and rip somebody's liver out. -- Forbes
-----------------------------------------------------------------------
Today: John Moses Browning's 157th Birthday

SpamAssassin users RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.