
Mailing List Archive: SpamAssassin: devel

Rule updates are too old

 

 



darxus at chaosreigns

Nov 21, 2011, 10:42 AM

Post #1 of 24 (1324 views)
Rule updates are too old

SpamAssassin version 3.3.0 has not had a rule update since 2011-11-02.
SpamAssassin version 3.3.1 has not had a rule update since 2011-11-02.
SpamAssassin version 3.3.2 has not had a rule update since 2011-11-02.


darxus at chaosreigns

Nov 21, 2011, 10:46 AM

Post #2 of 24 (1291 views)
Re: Rule updates are too old [In reply to]

This is now in a cron job set to run every Monday at noon. Once updates
are fixed, I think I'll run it daily.

It's only looking at the version number in the DNS records, not verifying
that the updates actually exist, contain anything, contain contents that
changed, etc.

The source is here: http://www.chaosreigns.com/sa/update-version-mon.pl

I'm not including 3.4.0 because I haven't bothered handling CNAMEs.

On 11/21, darxus [at] chaosreigns wrote:
> SpamAssassin version 3.3.0 has not had a rule update since 2011-11-02.
> SpamAssassin version 3.3.1 has not had a rule update since 2011-11-02.
> SpamAssassin version 3.3.2 has not had a rule update since 2011-11-02.
>

--
"Blades don't need reloading." - The Zombie Survival Guide by Max Brooks
http://www.ChaosReigns.com


darxus at chaosreigns

Nov 28, 2011, 9:00 AM

Post #3 of 24 (1262 views)
Rule updates are too old [In reply to]

SpamAssassin version 3.3.0 has not had a rule update since 2011-11-02.
SpamAssassin version 3.3.1 has not had a rule update since 2011-11-02.
SpamAssassin version 3.3.2 has not had a rule update since 2011-11-02.


darxus at chaosreigns

Dec 5, 2011, 9:00 AM

Post #4 of 24 (1254 views)
Rule updates are too old [In reply to]

SpamAssassin version 3.3.0 has not had a rule update since 2011-11-02.
SpamAssassin version 3.3.1 has not had a rule update since 2011-11-02.
SpamAssassin version 3.3.2 has not had a rule update since 2011-11-02.


darxus at chaosreigns

May 31, 2012, 6:54 PM

Post #5 of 24 (1178 views)
Re: Rule updates are too old [In reply to]

On 05/31, John Hardin wrote:
> We appear to have just crossed the threshold, maybe we'll get a
> rules update this weekend...
>
> http://www.chaosreigns.com/dnswl/tot.svg
>
> woohoo!

Wow, cool.

Looks like that's largely axb-generic, went from 41,619 to 103,398 spams
for the last two weekly / net masschecks. The second largest spam corpus
contributor is llanga, at 5,756. 63% of the spam is from axb-generic.

--
"Just because you're offended, doesn't mean you're right." - Ricky Gervais
http://www.ChaosReigns.com


KMcGrail at PCCC

May 31, 2012, 6:55 PM

Post #6 of 24 (1182 views)
Re: Rule updates are too old [In reply to]

> We appear to have just crossed the threshold, maybe we'll get a rules
> update this weekend...
>
> http://www.chaosreigns.com/dnswl/tot.svg
>
> woohoo!
>
Hot Damn.


axb.lists at gmail

Jun 1, 2012, 12:01 AM

Post #7 of 24 (1187 views)
Re: Rule updates are too old [In reply to]

On 06/01/2012 03:54 AM, darxus [at] chaosreigns wrote:
> On 05/31, John Hardin wrote:
>> We appear to have just crossed the threshold, maybe we'll get a
>> rules update this weekend...
>>
>> http://www.chaosreigns.com/dnswl/tot.svg
>>
>> woohoo!
>
> Wow, cool.
>
> Looks like that's largely axb-generic, went from 41,619 to 103,398 spams
> for the last two weekly / net masschecks. The second largest spam corpus
> contributor is llanga, at 5,756. 63% of the spam is from axb-generic.
>

dunno if this is so cool.

if this corpus is supplying 63% of the spam, this may also mean that
results are biased.


KMcGrail at PCCC

Jun 1, 2012, 3:57 AM

Post #8 of 24 (1177 views)
Re: Rule updates are too old [In reply to]

On 6/1/2012 3:01 AM, Axb wrote:
> On 06/01/2012 03:54 AM, darxus [at] chaosreigns wrote:
>> On 05/31, John Hardin wrote:
>>> We appear to have just crossed the threshold, maybe we'll get a
>>> rules update this weekend...
>>>
>>> http://www.chaosreigns.com/dnswl/tot.svg
>>>
>>> woohoo!
>>
>> Wow, cool.
>>
>> Looks like that's largely axb-generic, went from 41,619 to 103,398 spams
>> for the last two weekly / net masschecks. The second largest spam corpus
>> contributor is llanga, at 5,756. 63% of the spam is from axb-generic.
>>
>
> dunno if this is so cool.
>
> if this corpus is supplying 63% of the spam, this may also mean that
> results are biased.

They are biased but we'll build on it and get more masscheckers!


matthias at leisi

Jun 1, 2012, 4:50 AM

Post #9 of 24 (1180 views)
Re: Rule updates are too old [In reply to]

On Fri, Jun 1, 2012 at 12:57 PM, Kevin A. McGrail <KMcGrail [at] pccc> wrote:

> They are biased but we'll build on it and get more masscheckers!

I can provide a mostly spamtrap-driven corpus. Its size is basically
endless; I'm just throwing most of it away currently.

-- Matthias


axb.lists at gmail

Jun 1, 2012, 4:55 AM

Post #10 of 24 (1178 views)
Re: Rule updates are too old [In reply to]

On 06/01/2012 01:50 PM, Matthias Leisi wrote:
> On Fri, Jun 1, 2012 at 12:57 PM, Kevin A. McGrail<KMcGrail [at] pccc> wrote:
>
>> They are biased but we'll build on it and get more masscheckers!
>
> I can provide a mostly spamtrap-driven corpus. Its size is basically
> endless; I'm just throwing most of it away currently.
>
> -- Matthias

Matthias

I'll contact you offlist (or as per Swinog : "offline" :)


me at junc

Jun 1, 2012, 7:28 AM

Post #11 of 24 (1177 views)
Re: Rule updates are too old [In reply to]

Den 2012-06-01 03:54, darxus [at] chaosreigns skrev:

>> http://www.chaosreigns.com/dnswl/tot.svg

from 2007 ?

> Looks like that's largely axb-generic, went from 41,619 to 103,398 spams
> for the last two weekly / net masschecks. The second largest spam corpus
> contributor is llanga, at 5,756. 63% of the spam is from axb-generic.

How do others build a spam corpus when their MTA rejects spam using RBLs and
a ClamAV milter?

A separate MTA with no spam/virus filtering?


axb.lists at gmail

Jun 1, 2012, 7:35 AM

Post #12 of 24 (1181 views)
Re: Rule updates are too old [In reply to]

On 06/01/2012 04:28 PM, Benny Pedersen wrote:
> Den 2012-06-01 03:54, darxus [at] chaosreigns skrev:
>
>>> http://www.chaosreigns.com/dnswl/tot.svg
>
> from 2007 ?
>
>> Looks like that's largely axb-generic, went from 41,619 to 103,398 spams
>> for the last two weekly / net masschecks. The second largest spam corpus
>> contributor is llanga, at 5,756. 63% of the spam is from axb-generic.
>
> How do others build a spam corpus when their MTA rejects spam using RBLs and
> a ClamAV milter?
>
> A separate MTA with no spam/virus filtering?

trap servers don't use rbls or AV tools.

If anything they'll discard after accepting.


me at junc

Jun 1, 2012, 7:44 AM

Post #13 of 24 (1182 views)
Re: Rule updates are too old [In reply to]

Den 2012-06-01 16:35, Axb skrev:

> trap servers don't use rbls or AV tools.
> If anything they'll discard after accepting.

Super, I'll start building a new MTA then. If anyone would like to help,
please mail me off-list.


darxus at chaosreigns

Jun 1, 2012, 8:05 AM

Post #14 of 24 (1182 views)
Re: Rule updates are too old [In reply to]

On 06/01, Benny Pedersen wrote:
> Den 2012-06-01 03:54, darxus [at] chaosreigns skrev:
>
> >>http://www.chaosreigns.com/dnswl/tot.svg
>
> from 2007 ?

Yes, that data goes back to 2007. And the last tick on the right is July
2012, which the data lines haven't reached yet.

> How do others build a spam corpus when their MTA rejects spam using RBLs and
> a ClamAV milter?

I actually stopped blocking on RBLs at my MTA to get better data for this.
"Real" spam sent to real users and manually verified is better than what
can easily be harvested in very large quantities with spamtraps. I'd say
the spam that gets through blocking on RBLs to real users is also much
better than what you can get from a spam trap.

What matters the most is the stuff that's hard to catch. And I fear
that is currently massively under-represented. It doesn't need to be
all spam that was sent to you in order to be useful for re-scoring.

I think when I started I was automatically rejecting on RBLs and
automatically rejecting everything over some spamassassin threshold, and
the people on this list at the time had no objection to me contributing the
spam that still got through. That's the most important stuff.

Spam traps are obviously very useful, but I'd say they're far from
representative of the spam end users ever see getting through their spam
filters.

--
"Let's just say that if complete and utter chaos was lightning, then
he'd be the sort to stand on a hilltop in a thunderstorm wearing wet
copper armour and shouting 'All gods are bastards'." - The Color of Magic
http://www.ChaosReigns.com


darxus at chaosreigns

Jul 18, 2012, 11:48 AM

Post #15 of 24 (1048 views)
Re: Rule updates are too old [In reply to]

On 07/18, John Hardin wrote:
> On Wed, 18 Jul 2012, darxus [at] chaosreigns wrote:
>
> >SpamAssassin version 3.3.0 has not had a rule update since 2012-07-07.
> >SpamAssassin version 3.3.1 has not had a rule update since 2012-07-07.
> >SpamAssassin version 3.3.2 has not had a rule update since 2012-07-07.
> >
> >20120717: Spam or ham is below threshold of 150,000: http://ruleqa.spamassassin.org/?daterev=20120717
> >20120717: Spam: 239513, Ham: 122986
>
> Something is screwy here. I just checked the Corpus Quality on a
> recent nightly run and it reports only ~300 messages in my combined
> ham and spam corpora. I upload a _lot_ more than that.

bb-jhardin? I see 247 hams for this date. Just to be sure, you're aware
it only uses ham that is no more than 6 years old, right?

--
"I refuse to tip toe through life only to arrive safely at death."
http://www.ChaosReigns.com


jhardin at impsec

Jul 18, 2012, 12:13 PM

Post #16 of 24 (1048 views)
Re: Rule updates are too old [In reply to]

On Wed, 18 Jul 2012, darxus [at] chaosreigns wrote:

> On 07/18, John Hardin wrote:
>> On Wed, 18 Jul 2012, darxus [at] chaosreigns wrote:
>>
>>> SpamAssassin version 3.3.0 has not had a rule update since 2012-07-07.
>>> SpamAssassin version 3.3.1 has not had a rule update since 2012-07-07.
>>> SpamAssassin version 3.3.2 has not had a rule update since 2012-07-07.
>>>
>>> 20120717: Spam or ham is below threshold of 150,000: http://ruleqa.spamassassin.org/?daterev=20120717
>>> 20120717: Spam: 239513, Ham: 122986
>>
>> Something is screwy here. I just checked the Corpus Quality on a
>> recent nightly run and it reports only ~300 messages in my combined
>> ham and spam corpora. I upload a _lot_ more than that.
>
> bb-jhardin? I see 247 hams for this date. Just to be sure, you're aware
> it only uses ham that is no more than 6 years old, right?

I should have a lot more than 247 hams in the last 6 years.

The spam count in bb-jhardin-fraud is also very low.

--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin [at] impsec FALaholic #11174 pgpk -a jhardin [at] impsec
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Teddy Kennedy's car has killed more people than any of my guns.
-----------------------------------------------------------------------
Today: the 43rd anniversary of Mary Jo Kopechne's death by drowning


darxus at chaosreigns

Jul 21, 2012, 4:41 PM

Post #17 of 24 (1033 views)
Re: Rule updates are too old [In reply to]

Replying to you and the spamassassin dev list.

On 07/21, lrinetti [at] libero wrote:
> Dear Sir
> I apologise for this trivial question, but can you explain to me what the
> benefit of this mail is?
>
> SpamAssassin version 3.3.0 has not had a rule update since 2012-07-07.
> SpamAssassin version 3.3.1 has not had a rule update since 2012-07-07.
> SpamAssassin version 3.3.2 has not had a rule update since 2012-07-07.

I think this part is clear: no new SpamAssassin rule update has been
published since then. Rule updates are what you should be downloading by
running "sa-update" automatically every day.

> 20120720: Spam or ham is below threshold of 150,000: http://ruleqa.spamassassin.org/?daterev=20120720
> 20120720: Spam: 245209, Ham: 122666

And this shows the probable reason why.

People provide data to the SpamAssassin project on their spams and non-spams
(hams) like this: https://wiki.apache.org/spamassassin/NightlyMassCheck

If we get data on fewer than 150,000 hams or fewer than 150,000 spams, rule
(score) generation doesn't happen. As you can see above, we're below that
threshold on ham, so rules haven't been updated.
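
As a rough illustration of that gate (not the project's actual rescoring code), the check amounts to requiring both counts to reach the threshold. The function name and structure below are hypothetical; only the 150,000 threshold and the example counts come from the report quoted above.

```python
# Hypothetical sketch of the corpus-size gate described above.
# Only the 150,000 threshold and the example counts come from this thread;
# the function name and structure are illustrative assumptions.
THRESHOLD = 150_000


def enough_corpus(spam: int, ham: int, threshold: int = THRESHOLD) -> bool:
    """Return True only if both the spam and the ham counts reach the threshold."""
    return spam >= threshold and ham >= threshold


if __name__ == "__main__":
    # Counts from the 20120720 report quoted above.
    spam, ham = 245209, 122666
    if enough_corpus(spam, ham):
        print("enough data: score generation can run")
    else:
        print("below threshold: no rule update")
```

With the 20120720 counts the spam side passes but the ham side does not, so the sketch reports that no update would be generated.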

A substantial part of the reason for this is that new masscheck accounts
have not been allowed for some time, as mentioned on the above web page.

> I read this on various mailing list related to Spamassassin Users, but i am
> not an experienced SA user.
>
> Thank You
>
> P.S. Why have the rules not been updated since 2012-07-07?
> I'm using SA with Exim4 4.74 in an Ubuntu Server 11.04 and it is working fine,
> also with ClamAV, Courier IMAP/POP server, sa-exim, razor, pyzor.

Yup, it should not cause your SpamAssassin to stop functioning.

--
"Believe nothing, no matter where you read it or who has said it, even
if I have said it, unless it agrees with your own reason and your own
common sense." - Buddha, 563-483 B.C.
http://www.ChaosReigns.com


darxus at chaosreigns

Aug 19, 2012, 9:08 AM

Post #18 of 24 (943 views)
Re: Rule updates are too old [In reply to]

It's (only) checking the DNS record, yes. It has not changed since
yesterday.

darxus [at] pani:~/progs/sa$ host -t txt 2.3.3.updates.spamassassin.org
2.3.3.updates.spamassassin.org descriptive text "1374176"

2012-08-16 3.3.2 1373273
2012-08-17 3.3.2 1373758
2012-08-18 3.3.2 1374176
2012-08-19 3.3.2 1374176

It runs at noon Eastern. What time does it get published? Might be good
to run this script at a different time?
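
For anyone who wants to reproduce the check, here is a minimal sketch of the logic in Python. It is not the Perl script itself. It assumes the TXT record name is formed by reversing the version components, as the host query above suggests, and the function names are made up for illustration. The actual DNS lookup is left out so the example runs without network access.

```python
# Hypothetical sketch; the real monitoring script is written in Perl and may differ.
# Assumption: the TXT record name is the reversed version number followed by
# ".updates.spamassassin.org", as in the host query shown above.


def update_dns_name(version: str) -> str:
    """Build the TXT record name for a version, e.g. 3.3.2 -> 2.3.3.updates.spamassassin.org."""
    reversed_parts = ".".join(reversed(version.split(".")))
    return f"{reversed_parts}.updates.spamassassin.org"


def last_change_date(history):
    """Given (date, update_number) pairs in date order, return the date the number last changed."""
    last_date = None
    previous = None
    for date, number in history:
        if number != previous:
            last_date = date
            previous = number
    return last_date


if __name__ == "__main__":
    print(update_dns_name("3.3.2"))
    # Daily samples copied from the log above.
    history = [
        ("2012-08-16", "1373273"),
        ("2012-08-17", "1373758"),
        ("2012-08-18", "1374176"),
        ("2012-08-19", "1374176"),
    ]
    print("no rule update since", last_change_date(history))
```

Because the number logged on 2012-08-19 is the same as on 2012-08-18, the sketch reports 2012-08-18 as the date of the last change.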


On 08/19, Kevin A. McGrail wrote:
> This might be a bug. I think we have published. Are you checking with dns
> or how are you checking?
> Regards,
> KAM
>
> darxus [at] chaosreigns wrote:
>
> SpamAssassin version 3.3.0 has not had a rule update since 2012-08-18.
> SpamAssassin version 3.3.1 has not had a rule update since 2012-08-18.
> SpamAssassin version 3.3.2 has not had a rule update since 2012-08-18.
>
> 20120818: Spam: 596883, Ham: 209684

--
"Blades don't need reloading." - The Zombie Survival Guide by Max Brooks
http://www.ChaosReigns.com


KMcGrail at PCCC

Aug 20, 2012, 6:48 AM

Post #19 of 24 (933 views)
Re: Rule updates are too old [In reply to]

On 8/19/2012 12:08 PM, darxus [at] chaosreigns wrote:
> It's (only) checking the DNS record, yes. It has not changed since
> yesterday.
>
> darxus [at] pani:~/progs/sa$ host -t txt 2.3.3.updates.spamassassin.org
> 2.3.3.updates.spamassassin.org descriptive text "1374176"
>
> 2012-08-16 3.3.2 1373273
> 2012-08-17 3.3.2 1373758
> 2012-08-18 3.3.2 1374176
> 2012-08-19 3.3.2 1374176
>
> It runs at noon Eastern. What time does it get published? Might be good
> to run this script at a different time?

From checking the cron output, I think the bug is in the counts you have,
possibly because of the times at which they are submitted.

SPAM: 51876 (150000 required)
Insufficient spam corpus to generate scores; aborting.
Exit Status 9 is not zero for do-nightly-rescore-example


So it is correct we did not publish a rule on 8/18. What's odd is the different counts of spam files but I'm guessing that's based on timing.

This finished at Date: Sun, 19 Aug 2012 02:57:53 GMT

Regards,
KAM


> On 08/19, Kevin A. McGrail wrote:
>> This might be a bug. I think we have published. Are you checking with dns
>> or how are you checking?
>> Regards,
>> KAM
>>
>> darxus [at] chaosreigns wrote:
>>
>> SpamAssassin version 3.3.0 has not had a rule update since 2012-08-18.
>> SpamAssassin version 3.3.1 has not had a rule update since 2012-08-18.
>> SpamAssassin version 3.3.2 has not had a rule update since 2012-08-18.
>>
>> 20120818: Spam: 596883, Ham: 209684


--
*Kevin A. McGrail*
President

Peregrine Computer Consultants Corporation
3927 Old Lee Highway, Suite 102-C
Fairfax, VA 22030-2422

http://www.pccc.com/

703-359-9700 x50 / 800-823-8402 (Toll-Free)
703-359-8451 (fax)
KMcGrail [at] PCCC <mailto:kmcgrail [at] pccc>


darxus at chaosreigns

Aug 20, 2012, 10:10 AM

Post #20 of 24 (932 views)
Re: Rule updates are too old [In reply to]

On 08/20, Kevin A. McGrail wrote:
> SPAM: 51876 (150000 required)
> Insufficient spam corpus to generate scores; aborting.
> Exit Status 9 is not zero for do-nightly-rescore-example
>
>
> So it is correct we did not publish a rule on 8/18. What's odd is the different counts of spam files but I'm guessing that's based on timing.
>
> This finished at Date: Sun, 19 Aug 2012 02:57:53 GMT

So if I ran my script at this time it would be more likely to be
representative of what the rule generation stuff sees?

$ date -d'Sun, 19 Aug 2012 02:57:53 GMT'
Sat Aug 18 22:57:53 EDT 2012


I started logging versions every 5 minutes (EDT):

2012-08-19-23-55 3.3.2 1374176
2012-08-20-00-00 3.3.2 1374711

So maybe it's best to check half an hour after midnight EDT? Changed to
that.
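
A small sketch of how such a log could be scanned for the moment the published number changes. The line format is taken from the two samples above; the function names are hypothetical, and it assumes the log holds a single SpamAssassin version in time order.

```python
from datetime import datetime

# Hypothetical sketch: find when the published update number changed,
# given log lines in the "YYYY-MM-DD-HH-MM version number" format shown above.


def parse_log_line(line: str):
    """Split a line like '2012-08-20-00-00 3.3.2 1374711' into (timestamp, version, number)."""
    stamp, version, number = line.split()
    return datetime.strptime(stamp, "%Y-%m-%d-%H-%M"), version, number


def first_change(lines):
    """Return the timestamp of the first sample whose number differs from the previous one."""
    previous = None
    for line in lines:
        when, _version, number = parse_log_line(line)
        if previous is not None and number != previous:
            return when
        previous = number
    return None


if __name__ == "__main__":
    log = [
        "2012-08-19-23-55 3.3.2 1374176",
        "2012-08-20-00-00 3.3.2 1374711",
    ]
    print("update number changed at", first_change(log))
```

With five-minute sampling this only narrows the publish time to the interval between two samples, which is why checking a little later than the observed change leaves some margin.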

I really wish there was a way to run cron jobs independent of
daylight saving time on a machine configured to use it normally. Hmm, I
can convert all my cron jobs to UTC using CRON_TZ:
http://answers.yahoo.com/question/index?qid=20061005013730AAlCxwO

--
"You only truly own what you can carry at a dead run."
- 14th & 15th century Landsknechts
http://www.ChaosReigns.com


KMcGrail at PCCC

Aug 27, 2012, 4:58 PM

Post #21 of 24 (902 views)
Re: Rule updates are too old [In reply to]

On 8/20/2012 1:10 PM, darxus [at] chaosreigns wrote:
> So if I ran my script at this time it would be more likely to be
> representative of what the rule generation stuff sees?
I don't actually know what resources your script uses to answer that
question but I would say that makes a lot of sense.

Regards,
KAM


darxus at chaosreigns

Aug 27, 2012, 6:21 PM

Post #22 of 24 (903 views)
Re: Rule updates are too old [In reply to]

On 08/27, Kevin A. McGrail wrote:
> On 8/20/2012 1:10 PM, darxus [at] chaosreigns wrote:
> >So if I ran my script at this time it would be more likely to be
> >representative of what the rule generation stuff sees?
> I don't actually know what resources your script uses to answer that
> question but I would say that makes a lot of sense.

Thanks. It scrapes it from http://ruleqa.spamassassin.org/ . It's kind of
buried in there - there's a small font javascript link to it.

--
"My definition of a free society is a society where it is safe to be
unpopular." - Adlai E. Stevenson Jr.
http://www.ChaosReigns.com


KMcGrail at PCCC

Aug 27, 2012, 6:25 PM

Post #23 of 24 (901 views)
Re: Rule updates are too old [In reply to]

On 8/27/2012 9:21 PM, darxus [at] chaosreigns wrote:
>
> Thanks. It scrapes it from http://ruleqa.spamassassin.org/ . It's kind of
> buried in there - there's a small font javascript link to it.

Hmm perhaps we tie this into the server cron jobs instead? But
otherwise your information is still fairly close and useful at least to
know an update didn't come out.

Regards,
KAM


darxus at chaosreigns

Aug 27, 2012, 6:31 PM

Post #24 of 24 (901 views)
Re: Rule updates are too old [In reply to]

On 08/27, Kevin A. McGrail wrote:
> On 8/27/2012 9:21 PM, darxus [at] chaosreigns wrote:
> >
> >Thanks. It scrapes it from http://ruleqa.spamassassin.org/ . It's kind of
> >buried in there - there's a small font javascript link to it.
>
> Hmm perhaps we tie this into the server cron jobs instead? But
> otherwise your information is still fairly close and useful at least
> to know an update didn't come out.

Yup, that would probably be better. I just did this because it's what I
have access to. I just updated the copy of the script on my web site:
http://www.chaosreigns.com/sa/update-version-mon.pl

--
"Hermes will help you get your wagon unstuck, but only if you push on it."
- Greek Alphabet Oracle
http://www.ChaosReigns.com
