
Mailing List Archive: SpamAssassin: users

Hostkarma whitelist needs something..

 

 



jarif at iki

Oct 12, 2009, 1:46 PM

Post #1 of 39 (1310 views)
Permalink
Hostkarma whitelist needs something..

Email: 911 Autolearn: 603 AvgScore: 13.10 AvgScanTime: 12.16 sec
Spam: 462 Autolearn: 414 AvgScore: 33.17 AvgScanTime: 10.80 sec
Ham: 449 Autolearn: 189 AvgScore: -7.55 AvgScanTime: 13.55 sec

Time Spent Running SA: 3.08 hours
Time Spent Processing Spam: 1.39 hours
Time Spent Processing Ham: 1.69 hours

TOP SPAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM
----------------------------------------------------------------------
1 BAYES_99 449 49.29 97.19 0.00
2 DCC_CHECK 445 65.31 96.32 33.41
3 RAZOR2_CHECK 443 48.63 95.89 0.00
4 RAZOR2_CF_RANGE_51_100 441 48.41 95.45 0.00
5 DIGEST_MULTIPLE 438 48.08 94.81 0.00
6 BOTNET 428 46.98 92.64 0.00
7 HTML_MESSAGE 424 50.38 91.77 7.80
8 URIBL_BLACK 420 46.10 90.91 0.00
9 URIBL_SBL 416 45.66 90.04 0.00
10 RAZOR2_CF_RANGE_E8_51_100 414 45.44 89.61 0.00
11 RCVD_IN_BRBL_RELAY 404 47.20 87.45 5.79
12 URIBL_JP_SURBL 370 40.61 80.09 0.00
13 URIBL_WS_SURBL 350 38.42 75.76 0.00
14 URIBL_AB_SURBL 281 30.85 60.82 0.00
15 RCVD_IN_BL_SPAMCOP_NET 257 28.32 55.63 0.22
16 RDNS_NONE 252 27.66 54.55 0.00
17 MIME_HTML_ONLY 244 27.77 52.81 2.00
18 RAZOR2_CF_RANGE_E4_51_100 197 21.62 42.64 0.00
19 URIBL_OB_SURBL 176 19.32 38.10 0.00
20 URI_HEX 133 15.15 28.79 1.11
----------------------------------------------------------------------

TOP HAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM
----------------------------------------------------------------------
1 BAYES_00 404 44.35 0.00 89.98
2 AWL 298 37.98 10.39 66.37
3 RCVD_IN_DNSWL_LOW 196 21.51 0.00 43.65
4 DCC_CHECK 150 65.31 96.32 33.41
5 RCVD_IN_HOSTKARMA_W 132 24.37 19.48 29.40
6 KHOP_RCVD_UNTRUST 106 21.51 19.48 23.61
7 DKIM_SIGNED 104 11.64 0.43 23.16
8 RCVD_IN_DNSWL_HI 85 9.33 0.00 18.93
9 RCVD_IN_HOSTKARMA_WL 77 18.33 19.48 17.15
10 RCVD_IN_DNSWL_MED 65 7.14 0.00 14.48
11 KHOP_HELO_FCRDNS 59 7.24 1.52 13.14
12 KHOP_NO_FULL_NAME 41 4.50 0.00 9.13
13 HTML_MESSAGE 35 50.38 91.77 7.80
14 RCVD_IN_BRBL_RELAY 26 47.20 87.45 5.79
15 DKIM_VERIFIED 23 2.52 0.00 5.12
16 RCVD_IN_BSP_OTHER 21 2.31 0.00 4.68
17 MIME_QP_LONG_LINE 19 2.96 1.73 4.23
18 KHOP_RCVD_TRUST 14 1.54 0.00 3.12
19 KHOP_PGP_SIGNED 14 1.54 0.00 3.12
20 ALL_TRUSTED 10 1.10 0.00 2.23
----------------------------------------------------------------------

I just started using Katz's wiki rules and it brought HOSTKARMA with it.

I have not yet seen any blacklists of HOSTKARMA, but the whitelists are
there.

RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM
9 RCVD_IN_HOSTKARMA_WL 77 18.33 19.48 17.15

Is this really a whitelist?

I think it needs tuning. I do not remove it, as it does not appear in
the SPAM list, but just wondering.

Confused am I?
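
The percentages above can be turned back into message counts to see why the rule looks odd. This is a minimal Python sketch, assuming (as the tables suggest) that %OFSPAM and %OFHAM are taken relative to the Spam and Ham totals at the top of the report; the function name `implied_hits` is made up for illustration.

```python
def implied_hits(total: int, percent: float) -> int:
    """Convert a %OFSPAM / %OFHAM figure back into a message count."""
    return round(total * percent / 100)


# Totals and percentages taken from the report above.
spam_total, ham_total = 462, 449
spam_hits = implied_hits(spam_total, 19.48)  # RCVD_IN_HOSTKARMA_WL, %OFSPAM
ham_hits = implied_hits(ham_total, 17.15)    # RCVD_IN_HOSTKARMA_WL, %OFHAM

# Share of all hits on the "whitelist" rule that were actually spam.
spam_share = spam_hits / (spam_hits + ham_hits)

print(spam_hits, ham_hits, f"{spam_share:.1%}")  # 90 77 53.9%
```

On this one-day sample, slightly more than half of the messages that hit the rule were spam, which is what prompts the question.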

--
http://www.iki.fi/jarif/

You're being followed. Cut out the hanky-panky for a few days.


jason at i6ix

Oct 13, 2009, 6:06 AM

Post #2 of 39 (1266 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

Jari Fredriksson wrote:
> I just started using Katz's wiki rules and it brought HOSTKARMA with it.
>
> Is this really a whitelist?
>
Funny, after the discussions yesterday, I did the same thing only to
wake up this morning with a mess of mis-marked messages due to hits on
hostkarma. Until I can do further analysis, I've dropped
RCVD_IN_HOSTKARMA_BL and RCVD_IN_HOSTKARMA_WL to .001 and -.001
respectively.


antispam at khopis

Oct 13, 2009, 9:12 PM

Post #3 of 39 (1258 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

Jari Fredriksson wrote:
> TOP HAM RULES FIRED
> ----------------------------------------------------------------------
> RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM
> ----------------------------------------------------------------------
> 5 RCVD_IN_HOSTKARMA_W 132 24.37 19.48 29.40
> 6 KHOP_RCVD_UNTRUST 106 21.51 19.48 23.61
> 9 RCVD_IN_HOSTKARMA_WL 77 18.33 19.48 17.15
> 11 KHOP_HELO_FCRDNS 59 7.24 1.52 13.14
> 12 KHOP_NO_FULL_NAME 41 4.50 0.00 9.13
> 14 RCVD_IN_BRBL_RELAY 26 47.20 87.45 5.79
> 18 KHOP_RCVD_TRUST 14 1.54 0.00 3.12
> 19 KHOP_PGP_SIGNED 14 1.54 0.00 3.12

> I just started using Katz's wiki rules and it brought HOSTKARMA with it.
>
> I have not yet seen any blacklists of HOSTKARMA, but the whitelists are
> there. Is this really a whitelist?

You may notice 5 & 9 are similar. #5 is just a pure HOSTKARMA_WL test
that khop-bl scores at -0.1 while #9 is a modified test wrapping it in a
meta that ensures it isn't also hitting DNSWL_HI or DNSWL_MED before
subtracting additional points. As noted in previous emails of mine to
the list, KHOP_RCVD_UNTRUST adds a point to any DNSWL/HOSTKARMA_W hit
that doesn't pass SPF or DKIM while KHOP_RCVD_TRUST is the opposite.
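
The relationship between these rules can be sketched as plain boolean logic. This is only an illustration in Python of the description above, not the actual khop-bl rule definitions; the `Hits` class and its field names are hypothetical, and "the opposite" is read here as "a whitelist hit that does pass SPF or DKIM".

```python
from dataclasses import dataclass


@dataclass
class Hits:
    """Which underlying tests fired for one message (hypothetical names)."""
    hostkarma_w: bool = False
    dnswl_hi: bool = False
    dnswl_med: bool = False
    dnswl_low: bool = False
    spf_pass: bool = False
    dkim_valid: bool = False


def whitelisted(h: Hits) -> bool:
    # Any DNSWL level or the plain HOSTKARMA_W test counts as a whitelist hit.
    return h.hostkarma_w or h.dnswl_hi or h.dnswl_med or h.dnswl_low


def hostkarma_wl(h: Hits) -> bool:
    # Extra negative points only when DNSWL_HI / DNSWL_MED did not already fire.
    return h.hostkarma_w and not (h.dnswl_hi or h.dnswl_med)


def khop_rcvd_untrust(h: Hits) -> bool:
    # Whitelist hit without SPF or DKIM backing it up: add a point.
    return whitelisted(h) and not (h.spf_pass or h.dkim_valid)


def khop_rcvd_trust(h: Hits) -> bool:
    # Assumed reading of "the opposite": whitelist hit backed by SPF or DKIM.
    return whitelisted(h) and (h.spf_pass or h.dkim_valid)
```

Under this reading, a message that hits only HOSTKARMA_W and carries no SPF or DKIM pass fires RCVD_IN_HOSTKARMA_W, RCVD_IN_HOSTKARMA_WL and KHOP_RCVD_UNTRUST together.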

KHOP_HELO_FCRDNS hits ham far more often than I expected when I first
wrote it; it triggers when the relay's HELO doesn't match the relay IP's
rDNS. I just rescored it from 0.6 to 0.3.

KHOP_NO_FULL_NAME might be mis-firing. It's supposed to detect a
properly formatted name, in the form (sans quotes): "A K" or "Adam K"
or "A Katz" ... maybe somebody can find a flaw in my regex or an example
FP or FN? Here it is:

# This matches foreign characters by process of elimination.
# From: must start w/ ~upper, ~letters, space/punctuation, then ~upper
header   __FROM_FULL_NAME  From:name =~ /^[^a-z[:punct:][:cntrl:]\d\s][^[:punct:][:cntrl:]\d\s]*[[:punct:]\s]+[^a-z[:punct:][:cntrl:]\d\s]/
meta     KHOP_NO_FULL_NAME !(__FROM_ENCODED_QP||__FROM_NEEDS_MIME||__FROM_FULL_NAME)
describe KHOP_NO_FULL_NAME Sender does not have both First and Last names
score    KHOP_NO_FULL_NAME 0.259 # keep low!
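
For anyone who wants to hunt for FPs or FNs outside SpamAssassin, the name pattern can be approximated in Python. This is a rough sketch, not a faithful port: Python's `re` module has no POSIX bracket classes, so `[:punct:]` is replaced by ASCII `string.punctuation` and `[:cntrl:]` by the ASCII control range, which are assumptions that may differ from Perl's behaviour on non-ASCII input. It covers only the `__FROM_FULL_NAME` part, not the other two sub-rules in the meta.

```python
import re
import string

# ASCII stand-ins for the POSIX classes used in the original rule (assumption).
PUNCT = re.escape(string.punctuation)
CNTRL = r"\x00-\x1f\x7f"

UPPERISH = rf"[^a-z{PUNCT}{CNTRL}\d\s]"   # "not lowercase": upper or foreign
LETTERISH = rf"[^{PUNCT}{CNTRL}\d\s]"     # any letter-like character
SEPARATOR = rf"[{PUNCT}\s]"               # space or punctuation between names

FROM_FULL_NAME = re.compile(rf"^{UPPERISH}{LETTERISH}*{SEPARATOR}+{UPPERISH}")


def has_full_name(display_name: str) -> bool:
    """True if the display name looks like 'A K', 'Adam K' or 'A Katz'."""
    return FROM_FULL_NAME.match(display_name) is not None


for name in ["A K", "Adam K", "A Katz", "J. Smith", "adam katz", "Adam", "info"]:
    print(f"{name!r}: {has_full_name(name)}")
```

The first four names match and the last three do not, so a list or newsletter sender using a single-word or all-lowercase display name would fire KHOP_NO_FULL_NAME.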


lists07 at abbacomm

Oct 13, 2009, 11:43 PM

Post #4 of 39 (1257 views)
Permalink
RE: Hostkarma whitelist needs something.. [In reply to]

> >
> Funny, after the discussions yesterday, I did the same thing
> only to wake up this morning with a mess of mis-marked
> messages due to hits on hostkarma. Until I can do further
> analysis, I've dropped RCVD_IN_HOSTKARMA_BL and
> RCVD_IN_HOSTKARMA_WL to .001 and -.001 respectively.
>
>

jason

maybe some of you folks do not have your SA systems trained properly...

out of a recent stats run of 12999 total emails....

TOP SPAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM

----------------------------------------------------------------------

4 RCVD_IN_JMF_BL 3993 31.03 54.80 0.70

and

TOP HAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM

----------------------------------------------------------------------

4 RCVD_IN_JMF_W 2763 22.67 2.53 48.36

we do not use high scores yet we do score accordingly...

- rh


marc at perkel

Oct 14, 2009, 3:25 AM

Post #5 of 39 (1262 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

Jari Fredriksson wrote:
>
> I just started using Katz's wiki rules and it brought HOSTKARMA with it.
>
> I have not yet seen any blacklists of HOSTKARMA, but the whitelists are
> there.
>
> RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM
> 9 RCVD_IN_HOSTKARMA_WL 77 18.33 19.48 17.15
>
> Is this really a whitelist?
>
> I think it needs tuning. I do not remove it, as it does not appear in
> the SPAM list, but just wondering.
>
> Confused am I?
>
>

All I can say is that if these numbers were real or typical I would be
out of business.


hege at hege

Oct 14, 2009, 3:46 AM

Post #6 of 39 (1255 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

On Wed, Oct 14, 2009 at 03:25:36AM -0700, Marc Perkel wrote:
> All I can say is that if these numbers were real or typical I would be
> out of business.

And how much business do you get outside of US? I think that's your
problem..


marc at perkel

Oct 14, 2009, 3:53 AM

Post #7 of 39 (1255 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

Henrik K wrote:
> On Wed, Oct 14, 2009 at 03:25:36AM -0700, Marc Perkel wrote:
>> Jari Fredriksson wrote:
>>> I just started using Katz's wiki rules and it brought HOSTKARMA with it.
>>>
>>> I have not yet seen any blacklists of HOSTKARMA, but the whitelists are
>>> there.
>>>
>>> RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM
>>> 9 RCVD_IN_HOSTKARMA_WL 77 18.33 19.48 17.15
>>>
>>> Is this really a whitelist?
>>>
>>> I think it needs tuning. I do not remove it, as it does not appear in
>>> the SPAM list, but just wondering.
>>>
>>> Confused am I?
>>
>> All I can say is that if these numbers were real or typical I would be
>> out of business.
>
> And how much business do you get outside of US? I think that's your
> problem..

I get a lot from outside the US.


lists07 at abbacomm

Oct 14, 2009, 9:17 AM

Post #8 of 39 (1252 views)
Permalink
RE: Hostkarma whitelist needs something.. [In reply to]

> >
> >
>
> All I can say is that if these numbers were real or typical I
> would be out of business.
>

perkel,

i might be wrong, yet it doesn't appear to me that Jari has enough mail
volume to have a reasonable statistical base...

- rh


jarif at iki

Oct 14, 2009, 9:52 AM

Post #9 of 39 (1243 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

14.10.2009 19:17, R-Elists wrote:
>
>
>>>
>>>
>>
>> All I can say is that if these numbers were real or typical I
>> would be out of business.
>>
>
> perkel,
>
> i might be wrong, yet it doesn't appear to me that Jari has enough mail
> volume to have a reasonable statistical base...
>
> - rh
>

You are not wrong, this is basically a one-person system. My HAM is
mostly mailing lists and bulk from newspapers. My SPAM mostly arrives at
old, useless email addresses that work as a kind of 'spam trap', as they
are not used for anything but collecting SPAM. I also get an info@ address
for my earlier employer; I filter the SPAM out of it and forward the rest
to the company. My personal email for that company is also directed to me,
and works as a great SPAM source (nobody uses it anymore, but it attracts
much spam as it was used in Internet discussions while I worked there).

This is the latest sa-stats from cronjob before logrotate. Should be
email load for one day.

I'm pretty happy with my Bayes training, and BOTNET seems to work
nicely for me. But the HOSTKARMA_W* rules still make one wonder. They
are both now in the TOP SPAM list.

*********************************************************************

Email: 1333 Autolearn: 803 AvgScore: 11.93 AvgScanTime: 12.23 sec
Spam: 646 Autolearn: 541 AvgScore: 32.97 AvgScanTime: 10.58 sec
Ham: 687 Autolearn: 262 AvgScore: -7.85 AvgScanTime: 13.79 sec

Time Spent Running SA: 4.53 hours
Time Spent Processing Spam: 1.90 hours
Time Spent Processing Ham: 2.63 hours

TOP SPAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM
----------------------------------------------------------------------
1 BAYES_99 625 46.89 96.75 0.00
2 DCC_CHECK 608 60.32 94.12 28.53
3 RAZOR2_CHECK 600 45.01 92.88 0.00
4 HTML_MESSAGE 599 49.74 92.72 9.32
5 RAZOR2_CF_RANGE_51_100 599 44.94 92.72 0.00
6 RCVD_IN_BRBL_RELAY 598 46.21 92.57 2.62
7 URIBL_BLACK 596 44.71 92.26 0.00
8 DIGEST_MULTIPLE 589 44.19 91.18 0.00
9 BOTNET 588 44.11 91.02 0.00
10 URIBL_SBL 563 42.31 87.15 0.15
11 RAZOR2_CF_RANGE_E8_51_100 540 40.51 83.59 0.00
12 URIBL_JP_SURBL 537 40.29 83.13 0.00
13 URIBL_WS_SURBL 504 37.81 78.02 0.00
14 URIBL_AB_SURBL 426 31.96 65.94 0.00
15 RDNS_NONE 393 29.63 60.84 0.29
16 RCVD_IN_BL_SPAMCOP_NET 392 29.41 60.68 0.00
17 RCVD_IN_HOSTKARMA_WL 359 50.49 55.57 45.71
18 RCVD_IN_HOSTKARMA_W 359 67.29 55.57 78.31
19 KHOP_RCVD_UNTRUST 359 59.19 55.57 62.59
20 RCVD_IN_PSBL 352 26.41 54.49 0.00
----------------------------------------------------------------------

TOP HAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM
----------------------------------------------------------------------
1 BAYES_00 616 46.29 0.15 89.67
2 RCVD_IN_HOSTKARMA_W 538 67.29 55.57 78.31
3 AWL 506 41.26 6.81 73.65
4 KHOP_RCVD_UNTRUST 430 59.19 55.57 62.59
5 RCVD_IN_HOSTKARMA_WL 314 50.49 55.57 45.71
6 KHOP_HELO_FCRDNS 262 23.86 8.67 38.14
7 RCVD_IN_DNSWL_LOW 226 16.95 0.00 32.90
8 KHOP_NO_FULL_NAME 197 16.13 2.79 28.68
9 DCC_CHECK 196 60.32 94.12 28.53
10 RCVD_IN_DNSWL_HI 152 11.40 0.00 22.13
11 DKIM_SIGNED 144 11.78 2.01 20.96
12 RCVD_IN_DNSWL_MED 106 7.95 0.00 15.43
13 RCVD_IN_BSP_OTHER 65 4.88 0.00 9.46
14 HTML_MESSAGE 64 49.74 92.72 9.32
15 KHOP_RCVD_TRUST 59 4.43 0.00 8.59
16 DKIM_VERIFIED 39 3.00 0.15 5.68
17 KHOP_PGP_SIGNED 34 2.55 0.00 4.95
18 MIME_QP_LONG_LINE 34 3.60 2.17 4.95
19 KHOP_2IPS_RCVD 27 3.00 2.01 3.93
20 RCVD_IN_BRBL_RELAY 18 46.21 92.57 2.62
----------------------------------------------------------------------
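
Rows like the ones above are easy to check mechanically. Here is a small Python sketch that parses sa-stats style rows and lists rules whose spam hit rate is at least as high as their ham hit rate. It assumes the six-column layout shown in this report; the names `RuleRow` and `parse_row` are invented for illustration and are not part of sa-stats.pl.

```python
from typing import NamedTuple, Optional


class RuleRow(NamedTuple):
    rank: int
    name: str
    count: int
    pct_mail: float
    pct_spam: float
    pct_ham: float


def parse_row(line: str) -> Optional[RuleRow]:
    """Parse one 'RANK RULE COUNT %OFMAIL %OFSPAM %OFHAM' row, else None."""
    fields = line.split()
    if len(fields) != 6 or not fields[0].isdigit():
        return None  # header, separator or blank line
    rank, name, count, pct_mail, pct_spam, pct_ham = fields
    return RuleRow(int(rank), name, int(count),
                   float(pct_mail), float(pct_spam), float(pct_ham))


# A few whitelist rows copied from the TOP HAM RULES table above.
SAMPLE = """\
RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM
2 RCVD_IN_HOSTKARMA_W 538 67.29 55.57 78.31
5 RCVD_IN_HOSTKARMA_WL 314 50.49 55.57 45.71
7 RCVD_IN_DNSWL_LOW 226 16.95 0.00 32.90
10 RCVD_IN_DNSWL_HI 152 11.40 0.00 22.13
"""

rows = [row for row in map(parse_row, SAMPLE.splitlines()) if row]

# A whitelist rule should hit ham far more often than spam.
suspicious = [row.name for row in rows if row.pct_spam >= row.pct_ham]
print(suspicious)  # ['RCVD_IN_HOSTKARMA_WL']
```

With only about 1300 messages from a single-user system, a result like this is a hint to look closer rather than a verdict on the list.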






--
http://www.iki.fi/jarif/

The human race is a race of cowards; and I am not only marching in that
procession but carrying a banner.
-- Mark Twain


rick_knight at rlknight

Oct 14, 2009, 10:33 AM

Post #10 of 39 (1242 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

Jari,

How did you produce the great looking statistics?

Thanks,
Rick


jarif at iki

Oct 14, 2009, 10:50 AM

Post #11 of 39 (1251 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

14.10.2009 20:33, Rick Knight wrote:
> Jari,
>
> How did you produce the great looking statistics?
>
> Thanks,
> Rick
>

It's a perl script called sa-stats.pl

I tried to google it for you, but could not find the original. Many
scripts with the same name though..

I put it on my server at http://www.iki.fi/jarif/sa/sa-stats.pl

I have modified the default file so that it scans /var/log/messages,
which works for me (Debian), the script not runs without arguments.

The file can be given to it as a parameter too.

I have created the cron job as /etc/cron.daily/00a-sa-stats so that it
is run daily just before logrotate, and scans the latest logs before
they are rotated.


--
http://www.iki.fi/jarif/

Q: How many WASPs does it take to change a light bulb?
A: One.


jarif at iki

Oct 14, 2009, 10:55 AM

Post #12 of 39 (1250 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

14.10.2009 20:50, Jari Fredriksson wrote:
>
> which works for me (Debian), the script not runs without arguments.

*now* runs. (s/not/now/g)

--
http://www.iki.fi/jarif/

Q: How many WASPs does it take to change a light bulb?
A: One.


spamassassin-users at lists

Oct 14, 2009, 11:49 AM

Post #13 of 39 (1244 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

Jari Fredriksson wrote:

>> Jari,
>>
>> How did you produce the great looking statistics?
>>
>> Thanks,
>> Rick
>>
>
> It's a perl script called sa-stats.pl
>
> I tried not google it for you, but could not find the original. Many
> scripts with the same name though..
>
> I put that to my server as http://www.iki.fi/jarif/sa/sa-stats.pl
>
> I have modified the default file so that is scans /var/log/messages
> which works for me (Debian), the script not runs without arguments.

That's a very nice script. I made one small change to it to make it work
with gzip compressed logs. I replaced:

open(F,"$log");

With:

open(F,$log=~/\.gz$/i?"zcat $log|":"$log");
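
For comparison, the same "open plain or gzip-compressed log transparently" idea can be sketched in Python. This is not part of sa-stats.pl; the function name `open_log` is made up here. It uses the standard `gzip` module instead of piping through `zcat`, so the file name is never handed to a shell.

```python
import gzip
from typing import TextIO


def open_log(path: str) -> TextIO:
    """Open a log file for reading as text, decompressing *.gz on the fly."""
    if path.lower().endswith(".gz"):
        # "rt" gives text-mode lines, just like a normal open().
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")
```

Usage is the same for both kinds of file, e.g. `with open_log("/var/log/messages.1.gz") as fh:` followed by a loop over `fh` (the path is just an example).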

Anyway, back to the JMF whitelist. I actually think it has improved in
quality recently. A while back it was triggering on a lot of spam that
it shouldn't have been, but it seems to happen a lot less now. I've just
run my last week's worth of logs through that sa-stats.pl script and it
agrees with me:

TOP HAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM

----------------------------------------------------------------------
1 RCVD_IN_JMF_W 1115 6.88 0.06 74.98
2 SPF_PASS 1084 7.48 0.94 72.90
3 BAYES_00 1070 6.55 0.00 71.96
4 HTML_MESSAGE 555 69.49 72.71 37.32
5 RCVD_IN_DNSWL_MED 452 2.77 0.00 30.40
6 DKIM_SIGNED 409 3.06 0.61 27.51
7 RCVD_IN_DNSWL_HI 383 2.34 0.00 25.76
8 HABEAS_ACCREDITED_SOI 308 1.88 0.00 20.71
9 RCVD_IN_BSP_TRUSTED 294 1.80 0.00 19.77
10 DKIM_VERIFIED 244 1.91 0.46 16.41
11 RCVD_IN_DNSWL_LOW 176 1.11 0.04 11.84

--
Mike Cardwell - IT Consultant and LAMP developer
Cardwell IT Ltd. (UK Reg'd Company #06920226) http://cardwellit.com/
Technical Blog: https://secure.grepular.com/blog/


jarif at iki

Oct 14, 2009, 12:03 PM

Post #14 of 39 (1243 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

14.10.2009 21:49, Mike Cardwell wrote:
> That's a very nice script. I made one small change to it to make it work
> with gzip compressed logs. I replaced:
>
> open(F,"$log");
>
> With:
>
> open(F,$log=~/\.gz$/i?"zcat $log|":"$log");
>

I took that modification into my copy, and updated the script in my
website too.

Works great, thanks!


jhardin at impsec

Oct 14, 2009, 12:18 PM

Post #15 of 39 (1245 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

On Wed, 14 Oct 2009, Jari Fredriksson wrote:

> 14.10.2009 21:49, Mike Cardwell wrote:
>> Jari Fredriksson wrote:
>>
>> > It's a perl script called sa-stats.pl
>> >
>> > I have modified the default file so that is scans /var/log/messages
>> > which works for me (Debian), the script not runs without arguments.
>>
>> That's a very nice script. I made one small change to it to make it work
>> with gzip compressed logs.
>
> I took that modification into my copy, and updated the script in my
> website too.

Heh. I did the same a few months back, and added a new column listing
average score for messages that rule hit. Two versions, one for the
current /var/log/maillog and one for /var/log/maillog*gz, are here:

http://www.impsec.org/jhardin/antispam/

--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin [at] impsec FALaholic #11174 pgpk -a jhardin [at] impsec
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Windows Genuine Advantage (WGA) means that now you use your
computer at the sufferance of Microsoft Corporation. They can
kill it remotely without your consent at any time for any reason;
it also shuts down in sympathy when the servers at Microsoft crash.
-----------------------------------------------------------------------
13 days since a sunspot last seen - EPA blames CO2 emissions


rick_knight at rlknight

Oct 14, 2009, 2:03 PM

Post #16 of 39 (1249 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

Jari,

Thanks again for putting the script up. I found the original site and
downloaded the 3.1 version. In the instructions it says "If your top 5
does not contain URIBL_BLACK, see http://www.uribl.com/usage.shtml". On
my system URIBL_BLACK is number 8, so I follow the link and it takes me
to a page that has a ruleset that needs to be added to my local
configuration directory (/etc/mail/spamassassin). I don't appear to have
any local rulesets in my configuration directory to add them to. Can you
tell me how to add these rules?

Thanks again,
Rick


jarif at iki

Oct 14, 2009, 2:04 PM

Post #17 of 39 (1240 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

14.10.2009 23:55, Rick Knight wrote:
> Jari,
>
> Thanks again for putting the script up. I found the original site and
> downloaded the 3.1 version. In the instructions it says "If your top 5
> does not contain URIBL_BLACK, see http://www.uribl.com/usage.shtml". On
> my system URIBL_BLACK is number 8, so I follow the link and it takes me
> to a page the has a ruleset that needs to be added to my local
> configuration directory (/etc/mail/spamassassin). I don't appear to have
> any local rulesets in my configuration directory to add them to. Can you
> tell me how to add these rules?
>
> Thanks again,
> Rick

I don't know.

First things first! Please give me the URL of the original site and
latest version.

Then we have to check what SpamAssassin users have to say about this. I have
URIBL_BLACK at row 9.

--
http://www.iki.fi/jarif/

You are fairminded, just and loving.


jarif at iki

Oct 14, 2009, 2:08 PM

Post #18 of 39 (1243 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

14.10.2009 23:55, Rick Knight wrote:
>
> Jari,
>
> Thanks again for putting the script up. I found the original site and
> downloaded the 3.1 version. In the instructions it says "If your top 5
> does not contain URIBL_BLACK, see http://www.uribl.com/usage.shtml". On
> my system URIBL_BLACK is number 8, so I follow the link and it takes me
> to a page the has a ruleset that needs to be added to my local
> configuration directory (/etc/mail/spamassassin). I don't appear to have
> any local rulesets in my configuration directory to add them to. Can you
> tell me how to add these rules?
>
> Thanks again,
> Rick

Let us try to keep this discussion on the list, please!

You can put those rules in the /etc/mail/spamassassin/local.cf OR any
.cf file in that folder. SA reads ALL .cf files from there and activates
them.

--
http://www.iki.fi/jarif/

You are fairminded, just and loving.


guenther at rudersport

Oct 14, 2009, 2:14 PM

Post #19 of 39 (1241 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

On Wed, 2009-10-14 at 14:03 -0700, Rick Knight wrote:
> Thanks again for putting the script up. I found the original site and
> downloaded the 3.1 version. In the instructions it says "If your top 5
> does not contain URIBL_BLACK, see http://www.uribl.com/usage.shtml". On
> my system URIBL_BLACK is number 8, so I follow the link and it takes me
^^^^^^^^^^^^^^^^^^^^^^^
> to a page the has a ruleset that needs to be added to my local
> configuration directory (/etc/mail/spamassassin). I don't appear to have
> any local rulesets in my configuration directory to add them to. Can you
> tell me how to add these rules?

You do not need to add anything. It *is* part of your SA rules. Frankly,
it is part of the stock rules, so it's not much of a surprise you do
have it already...

The script and that comment are old. Everyone's spam is different. So if
it comes in 8th, that should be fine as well [1].


[1] Even though I would have expected it higher up, without any custom
rules.

--
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


rick_knight at rlknight

Oct 14, 2009, 2:21 PM

Post #20 of 39 (1250 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

Jari Fredriksson wrote:
>
> I don't know.
>
> First things first! Please give me the URL of the original site and
> latest version.
>
> Then, we have to check SpamAssassin Users have to say about this. I have
> URIBL_BLACK at row 9.
>
> --
> http://www.iki.fi/jarif/
>
> You are fairminded, just and loving.
Jari,

Here's the url for the newer version
"http://www.rulesemporium.com/programs/sa-stats-1.0.txt".

Here's the URIBL site with the URIBL_BLACK ruleset
"http://www.uribl.com/usage.shtml".

Thanks,
Rick


rick_knight at rlknight

Oct 14, 2009, 2:24 PM

Post #21 of 39 (1254 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

Karsten Bräckelmann wrote:
> On Wed, 2009-10-14 at 14:03 -0700, Rick Knight wrote:
>
>> Thanks again for putting the script up. I found the original site and
>> downloaded the 3.1 version. In the instructions it says "If your top 5
>> does not contain URIBL_BLACK, see http://www.uribl.com/usage.shtml". On
>> my system URIBL_BLACK is number 8, so I follow the link and it takes me
>>
> ^^^^^^^^^^^^^^^^^^^^^^^
>
>> to a page that has a ruleset that needs to be added to my local
>> configuration directory (/etc/mail/spamassassin). I don't appear to have
>> any local rulesets in my configuration directory to add them to. Can you
>> tell me how to add these rules?
>>
>
> You do not need to add anything. It *is* part of your SA rules. Frankly,
> it is part of the stock rules, so it's not much of a surprise you do
> have it already...
>
> The script and that comment are old. Everyone's spam is different. So if
> it comes in 8th, that should be fine as well [1].
>
>
> [1] Even though I would have expected it higher up, without any custom
> rules.
>
>
Thanks Karsten.


jarif at iki

Oct 14, 2009, 2:26 PM

Post #22 of 39 (1248 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

15.10.2009 0:21, Rick Knight wrote:
>
> Here's the url for the newer version
> "http://www.rulesemporium.com/programs/sa-stats-1.0.txt".
>

It's not newer, it's exactly the same version, without our later additions.

Please do not reply to my INBOX, I get confused. My Thunderbird sees
your replies as separate for List and my email.

Let's keep this on list only, please. Remove my email from addresses if
you reply.

--
http://www.iki.fi/jarif/

Beware of low-flying butterflies.


rick_knight at rlknight

Oct 14, 2009, 2:42 PM

Post #23 of 39 (1241 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

Jari Fredriksson wrote:
>
>
> 15.10.2009 0:21, Rick Knight wrote:
>>
>> Here's the url for the newer version
>> "http://www.rulesemporium.com/programs/sa-stats-1.0.txt".
>>
>
> It's not newer, it's exactly the same version, without our later
> additions.
>
> Please do not reply to my INBOX, I get confused. My Thunderbird sees
> your replies as separate for List and my email.
>
> Let's keep this on list only, please. Remove my email from addresses if
> you reply.
>
> --
> http://www.iki.fi/jarif/
>
> Beware of low-flying butterflies.
Sorry for the confusion. I think I'm a little confused now too. I
thought the version I got from you had instructions that indicated it
was for SA versions prior to 3.1 and included a link to a 3.1 version.
Maybe I got it backwards. If I did, I apologize.

Thanks again,
Rick


jarif at iki

Oct 14, 2009, 2:47 PM

Post #24 of 39 (1248 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

15.10.2009 0:42, Rick Knight wrote:
> Jari Fredriksson wrote:
>>
>>
>> 15.10.2009 0:21, Rick Knight wrote:
>>>
>>> Here's the url for the newer version
>>> "http://www.rulesemporium.com/programs/sa-stats-1.0.txt".
>>>
>>
>> It's not newer, it's exactly the same version, without our later
>> additions.
>>
>> Please do not reply to my INBOX, I get confused. My Thunderbird sees
>> your replies as separate for List and my email.
>>
>> Let's keep this on list only, please. Remove my email from addresses if
>> you reply.
>>
>> --
>> http://www.iki.fi/jarif/
>>
>> Beware of low-flying butterflies.
> Sorry for the confusion. I think I'm a little confused now too. I
> thought the version I got from you had instructions that indicated it
> was for SA versions prior to 3.1 and included a link to a 3.1 version.
> Maybe I got it backwards. If I did, I apologize.
>

I think the version from that link you provided contains exactly the
same message. It is the same 1.03 version of the script.

Or I'm too drunk, which I probably am.



--
http://www.iki.fi/jarif/

Beware of low-flying butterflies.


jarif at iki

Oct 14, 2009, 4:44 PM

Post #25 of 39 (1239 views)
Permalink
Re: Hostkarma whitelist needs something.. [In reply to]

14.10.2009 13:25, Marc Perkel wrote:
>
>
> Jari Fredriksson wrote:
>>
>> I just started using Katz's wiki rules and it brought HOSTKARMA with it.
>>
>> I have not yet seen any blacklists of HOSTKARMA, but the whitelists are
>> there.
>>
>> RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM
>> 9 RCVD_IN_HOSTKARMA_WL 77 18.33 19.48 17.15
>>
>> Is this really a whitelist?
>>
>> I think it needs tuning. I do not remove it, as it does not appear in
>> the SPAM list, but just wondering.
>>
>> Confused am I?
>>
>>
>
> All I can say is that if these numbers were real or typical I would be
> out of business.
>

I can post you my spam corpus in a tar file, if it helps you. But spam
is spam. It is Viagra and Rolex, nothing more.

Here is a recent sample:

Content preview: Creating the image of a self-confident and independent
person will help you to achieve great results in your future business.
You will create such an image owing replica Swiss watches presented at
our site! &nbsp;
[...]

Content analysis details: (44.7 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
-1.0 RCVD_IN_HOSTKARMA_W    RBL: HostKarma: relay in white list (first pass)
                            [212.16.98.53 listed in hostkarma.junkemailfilter.com]
 3.0 RCVD_IN_BRBL_RELAY     RBL: received via a relay rated as poor by Barracuda
                            [187.10.243.78 listed in bb.barracudacentral.org]
 2.0 URIBL_BLACK            Contains an URL listed in the URIBL blacklist
                            [URIs: teimcsku.cn]
 1.9 URIBL_AB_SURBL         Contains an URL listed in the AB SURBL blocklist
                            [URIs: teimcsku.cn]
 1.5 URIBL_WS_SURBL         Contains an URL listed in the WS SURBL blocklist
                            [URIs: teimcsku.cn]
 1.5 URIBL_JP_SURBL         Contains an URL listed in the JP SURBL blocklist
                            [URIs: teimcsku.cn]
 1.5 URIBL_OB_SURBL         Contains an URL listed in the OB SURBL blocklist
                            [URIs: teimcsku.cn]
 5.0 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
                            [score: 1.0000]
 1.8 SARE_SPEC_REPLICA_OBFU BODY: Rolex with obfuscated replica
 4.3 HELO_DYNAMIC_HCC       Relay HELO'd using suspicious hostname (HCC)
 4.4 HELO_DYNAMIC_IPADDR2   Relay HELO'd using suspicious hostname (IP addr 2)
 0.0 FH_HELO_EQ_D_D_D_D     Helo is d-d-d-d
 0.7 SARE_RECV_IP_FROMIP3   Received line is IP address from IP address
 0.2 KHOP_SC_CIDR8          Relay listed in SpamCop top 8 IP/8 CIDRs
 0.5 RELAY_BR               Relayed through Brazil
 1.5 URIBL_SBL              Contains an URL listed in the SBL blocklist
                            [URIs: teimcsku.cn]
 4.0 BOTNET                 Relay might be a spambot or virusbot
                            [botnet0.8,ip=187.10.243.78,rdns=187-10-243-78.dsl.telesp.net.br,maildomain=resdat.com,baddns,client,ipinhostname,clientwords]
 1.0 HTML_MESSAGE           BODY: HTML included in message
 1.5 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level above 50%
                            [cf: 100]
 0.5 RAZOR2_CHECK           Listed in Razor2 (http://razor.sf.net/)
 1.5 RAZOR2_CF_RANGE_E4_51_100 Razor2 gives engine 4 confidence level above 50%
                            [cf: 100]
 0.5 RAZOR2_CF_RANGE_51_100 Razor2 gives confidence level above 50%
                            [cf: 100]
 2.2 DCC_CHECK              Listed in DCC (http://rhyolite.com/anti-spam/dcc/)
 1.7 SARE_RECV_SPAM_DOMN02  Email passed through apparent spammer domain
 1.0 DIGEST_MULTIPLE        Message hits more than one network digest check
-0.0 RCVD_IN_HOSTKARMA_WL   RBL: HostKarma: unique whitelisted
 0.1 RDNS_NONE              Delivered to trusted network by a host with no rDNS
-3.0 KHOP_URIBL_ADJ         Undo autokill from URIBL overlap
 4.0 JM_SOUGHT_3            Body contains frequently-spammed text patterns
 1.0 KHOP_RCVD_UNTRUST      DNS-whitelisted sender is not verified
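
To see how little a whitelist hit moves the total in a report like this, the pts column can be summed per rule. What follows is only a rough sketch in Python, not part of sa-stats.pl (which is Perl); the function name sum_report_scores and the four-line sample are made up for illustration, and it assumes every score line starts with a decimal number followed by an upper-case rule name.

```python
import re

# A score line starts with an optional minus sign, a decimal number,
# then the rule name. Wrapped continuation lines such as
# "[URIs: teimcsku.cn]" or "2)" do not match and are skipped.
SCORE_LINE = re.compile(r"^\s*(-?\d+\.\d+)\s+([A-Z][A-Z0-9_]*)\b")


def sum_report_scores(report_text):
    """Return (total, per_rule) for the score lines of a report (hypothetical helper)."""
    per_rule = {}
    for line in report_text.splitlines():
        match = SCORE_LINE.match(line)
        if match:
            per_rule[match.group(2)] = float(match.group(1))
    # Round to one decimal, the precision the report itself prints.
    total = round(sum(per_rule.values()), 1)
    return total, per_rule


# Four lines copied from the report above, used as illustrative input only.
sample = """\
-1.0 RCVD_IN_HOSTKARMA_W RBL: HostKarma: relay in white list (first pass)
 3.0 RCVD_IN_BRBL_RELAY RBL: received via a relay rated as poor by Barracuda
 2.0 URIBL_BLACK Contains an URL listed in the URIBL blacklist
 5.0 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
"""

total, per_rule = sum_report_scores(sample)
print(total, per_rule["RCVD_IN_HOSTKARMA_W"])
```

In the full report the whitelist rule contributes -1.0 against a total of 44.7 points with 5.0 required, so it does not change the verdict for this message.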



--
http://www.iki.fi/jarif/

Write yourself a threatening letter and pen a defiant reply.
