Mailing List Archive: SpamAssassin: users

expire - theory and practical

 

 



lists07 at abbacomm

Nov 18, 2009, 10:12 AM

Post #1 of 3 (548 views)
expire - theory and practical

looking for theoretical and practical insight on general multi-domain email
hosting type servers...

Q1) on high-volume email servers, is it wise to expire more than once a day,
or is once a day the "right" amount, so that one is not always in some form
of expiring?

the setup question is so that we can get to what i am really driving at....

Q2) on a low, or much lower, volume email server, is it best to expire
once a day, or should it be done less frequently so that there is a better
set of data for bayes?

on one server we have been doing it once a day, yet i am wondering if we
should do it only once or twice a week to have better info in the bayes data
set.

thanks in advance

- rh


scheidell at secnap

Nov 18, 2009, 10:29 AM

Post #2 of 3 (500 views)
Re: expire - theory and practical

R-Elists wrote:
> looking for theoretical and practical insight on general multi domain email
> hosting type servers...
>
> Q1) on high volume email servers, is it wise to expire more than once a day,
> or is once a day the "right" amount so that once is not always in some form
> of expiring ???
>
> the setup questions is so that we can get to what i am really driving at....
>
> Q2) on a low, or much lower volume volume email server, is it best to expire
> once a day or should it be done less frequently so that there is a better
> set of data for bayes?
>
> one one server, we have been doing it once a day, yet i am wondering if we
> should do it only once or twice a week to have better info in the bayes data
> set.
>
>
bayes expire itself has some 'smarts' built into it that will (depending
on the max tokens setting) try to choose what to expire.

even if you run expire every day, on a low volume server it might not
expire anything.

So, I'd say, make sure you use the MySQL backend for bayes/awl and expire
every day, either at midnight or at 3am.

(why those times?)
you want to keep a DST change from triggering a second bayes expire, which
could happen if the job is scheduled in the hour that the clocks repeat.
Not that any bayes expire would take from 1am to 2am, but why chance it?


--
Michael Scheidell, CTO
Phone: 561-999-5000, x 1259
SECNAP Network Security Corporation

* Certified SNORT Integrator
* 2008-9 Hot Company Award Winner, World Executive Alliance
* Five-Star Partner Program 2009, VARBusiness
* Best Anti-Spam Product 2008, Network Products Guide
* King of Spam Filters, SC Magazine 2008



rwmaillists at googlemail

Nov 18, 2009, 6:10 PM

Post #3 of 3 (486 views)
Re: expire - theory and practical

On Wed, 18 Nov 2009 10:12:54 -0800
"R-Elists" <lists07 [at] abbacomm> wrote:

>
> looking for theoretical and practical insight on general multi domain
> email hosting type servers...
>
> Q1) on high volume email servers, is it wise to expire more than once
> a day, or is once a day the "right" amount so that once is not always
> in some form of expiring ???
> ...
> Q2) on a low, or much lower volume volume email server, is it best to
> expire once a day or should it be done less frequently so that there
> is a better set of data for bayes?

I think it's worth reading the sa-learn manpage on this, paying
particular attention to the EXPIRE LOGIC (particularly 'the definition
of "weird"') and the ESTIMATION PASS LOGIC.

The expiry logic really should be very simple: we want to reduce the
token count to 75% of nominal and then convert the token reduction to a
new atime cutoff. It's actually quite straightforward to simply compute
an accurate estimate of the atime cutoff; unfortunately SA doesn't
do that. It uses an overcomplicated and unreliable two-stage
process.
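
To make the "straightforward" option concrete, here is a minimal Python sketch of computing the cutoff directly from the token atimes. This is not SpamAssassin code: the function name `atime_cutoff`, its parameters, and the idea of having all atimes available as a list are assumptions made purely for illustration.

```python
def atime_cutoff(atimes, nominal_size, keep_fraction=0.75):
    """Return an atime cutoff that keeps about keep_fraction * nominal_size tokens.

    Hypothetical helper, not part of SpamAssassin. Tokens whose atime is
    strictly older (smaller) than the returned cutoff would be expired.
    Returns None when the database is already at or below the target size.
    """
    keep = int(nominal_size * keep_fraction)
    if keep <= 0:
        raise ValueError("nominal_size * keep_fraction must be at least 1")
    if len(atimes) <= keep:
        return None  # nothing needs to be expired
    # Sort newest first; the keep-th newest token marks the cutoff.
    newest_first = sorted(atimes, reverse=True)
    return newest_first[keep - 1]


# Ten tokens, nominal size 8: keep the 6 newest, expire the 4 oldest.
atimes = list(range(100, 1100, 100))
cutoff = atime_cutoff(atimes, nominal_size=8)
expired = [a for a in atimes if a < cutoff]
```

Ties on the cutoff atime are kept rather than expired, so the result can leave slightly more than the target number of tokens.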

The first stage is an estimate based on the previous expiry. If this
result looks "weird" by the aforementioned definition, then a crude
estimate based on power-of-two multiples of 12 hours (up to 256 days) is
used instead. This second stage is unpredictable, and can remove
huge numbers of tokens, so you want to avoid it. One of the
tests for "weird" is whether the number of tokens to be expired falls
outside the range 2/3 to 3/2 of the previous expiry count, so you should
be looking to expire roughly the same amount on each call. If you expire
too often the first stage will never "lock on", and you will get big
swings in the token counts. I'd suggest about 20% of the nominal size on
each call.
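
A rough Python sketch of the two stages, as I read them from the sa-learn manpage, may help here. It is not SpamAssassin's actual code: the function names are invented, only the ratio part of the "weird" test is shown (the manpage lists other conditions as well), and the first-stage formula is my reading of the manpage rather than something stated in this thread.

```python
HALF_DAY = 12 * 3600  # 12 hours in seconds


def first_stage_delta(old_delta, old_reduction, goal):
    """Scale the previous atime delta by previous reduction / current goal."""
    return old_delta * old_reduction / goal


def ratio_is_weird(old_reduction, goal):
    """Ratio part of the 'weird' test: previous count vs. current goal."""
    ratio = old_reduction / goal
    return ratio > 1.5 or ratio < 2 / 3


def estimation_pass(atimes, goal):
    """Try deltas of 12h * 2**k, from 256 days down to 12 hours.

    Returns the smallest delta that does not expire more than `goal`
    tokens, or None if even the largest delta would expire too many.
    """
    newest = max(atimes)
    chosen = None
    for power in range(9, -1, -1):  # 12h * 512 (256 days) down to 12h * 1
        delta = HALF_DAY * 2 ** power
        would_expire = sum(1 for a in atimes if newest - a > delta)
        if would_expire > goal:
            break
        chosen = delta
    return chosen


# 100 tokens, one per day of age from 0 to 99 days.
DAY = 86400
atimes = [(100 - age) * DAY for age in range(100)]
delta_for_40 = estimation_pass(atimes, goal=40)
delta_for_30 = estimation_pass(atimes, goal=30)
```

With these evenly aged tokens, a goal of 40 selects a 64-day delta, while a goal of 30 selects 128 days, which expires nothing at all. The candidate deltas are so coarse that a small change in the goal can swing the result a long way.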

All of this is really optimised for autoexpire. I wonder if it's
possible to turn off autoexpire and then run something like

sa-learn --cf='bayes_auto_expire 1'

from crontab instead of --force-expire?
