
Mailing List Archive: SpamAssassin: users

Mismarked Ham

 

 



mysqlstudent at gmail

Oct 14, 2009, 6:40 PM

Post #1 of 7 (742 views)
Permalink
Mismarked Ham

Hi,

I thought I would look through the quarantine for "BAYES_00" to see if
there were any mismarked messages or if Bayes was not firing
correctly. I have found a few, although not in the way I expected.

Instead of finding BAYES_00 in spam, I've found it in ham that was
pushed over the threshold to spam because of other rules. Here are the
headers from one such instance:

http://pastebin.com/m6c3cd5e3

"exxample.com" is my obfuscation. It was an HTML email with two small
GIF attachments that were a basic background image and two links to
youtube videos of a religious Muslim ceremony in Arabic with English
subtitles. All indications are that bayes is correct and it's ham.

Which rule(s) is then incorrect? What is the right solution here? Is
the only option to whitelist the user?

Thanks,
Alex


kremels at kreme

Oct 14, 2009, 6:54 PM

Post #2 of 7 (698 views)
Permalink
Re: Mismarked Ham [In reply to]

On 14-Oct-2009, at 19:40, MySQL Student wrote:
> Which rule(s) is then incorrect? What is the right solution here? Is
> the only option to whitelist the user?


What makes you think any of the rules are incorrect? A score of 6.1 is
not 100% (or even 99%, IIRC) spam.

Your spam tests were:

X-Spam-Status: Yes, hits=6.1 tag1=-300.0 tag2=5.0 kill=5.0
use_bayes=1 tests=BAYES_00, DKIM_SIGNED, EXTRA_MPART_TYPE, FREEMAIL_FROM,
HTML_MESSAGE, L_UNVERIFIED_GMAIL, PART_CID_STOCK, RELAYCOUNTRY_HIGH,
RELAYCOUNTRY_US, SPF_HELO_PASS, SPF_PASS, TVD_FW_GRAPHIC_NAME_LONG,
T_TVD_FW_GRAPHIC_ID1
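
For anyone who wants to check a header like this mechanically, here is a minimal Python sketch that splits the status line into its numeric fields and the list of rules that hit. The function name parse_status is made up for illustration, and the parsing only assumes the key=value layout visible in the header as quoted; it is not a general parser for every SpamAssassin setup.

```python
import re

# Header value copied from the message above, unfolded onto one line.
STATUS = (
    "Yes, hits=6.1 tag1=-300.0 tag2=5.0 kill=5.0 use_bayes=1 "
    "tests=BAYES_00, DKIM_SIGNED, EXTRA_MPART_TYPE, FREEMAIL_FROM, "
    "HTML_MESSAGE, L_UNVERIFIED_GMAIL, PART_CID_STOCK, RELAYCOUNTRY_HIGH, "
    "RELAYCOUNTRY_US, SPF_HELO_PASS, SPF_PASS, TVD_FW_GRAPHIC_NAME_LONG, "
    "T_TVD_FW_GRAPHIC_ID1"
)


def parse_status(value):
    """Split a status header value into numeric fields and rule names."""
    # Everything before "tests=" holds key=value pairs; the rest is the rule list.
    head, _, tests = value.partition("tests=")
    fields = {k: float(v) for k, v in re.findall(r"(\w+)=(-?[\d.]+)", head)}
    rules = [name.strip() for name in tests.split(",") if name.strip()]
    return fields, rules


fields, rules = parse_status(STATUS)
print("score:", fields["hits"], "threshold:", fields["kill"])
print("DKIM_SIGNED hit:", "DKIM_SIGNED" in rules)
print("DKIM_VERIFIED hit:", "DKIM_VERIFIED" in rules)
```

The last two lines make the oddity discussed below easy to spot: the signature was seen, but the verification rule is absent from the list.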

there's a couple of things here.

First, for some reason you have DKIM_SIGNED but not DKIM_VERIFIED,
which seems odd as this looks like a legit gmail message with a legit
DKIM signature. So there's one thing to check.

I'm not sure which of those scored what. Then there is the fact that
your custom rule "L_UNVERIFIED_GMAIL" hit. If that's the same rule I
see in the list archives, that scored 2.5 and pushed this email firmly
into being tagged as spam.

Maybe adjust that score, or adjust the assumptions that caused that
rule to be added to your config?

This IS a gmail message, right? So your unverified-gmail custom rule
is in error.

--
Penny! *Everything* is better with BlueTooth


guenther at rudersport

Oct 14, 2009, 7:35 PM

Post #3 of 7 (696 views)
Permalink
Re: Mismarked Ham [In reply to]

On Wed, 2009-10-14 at 19:54 -0600, LuKreme wrote:
> On 14-Oct-2009, at 19:40, MySQL Student wrote:
> > Which rule(s) is then incorrect? What is the right solution here? Is
> > the only option to whitelist the user?

> your spam test were:
>
> X-Spam-Status: Yes, hits=6.1 tag1=-300.0 tag2=5.0 kill=5.0
> use_bayes=1 tests=BAYES_00, DKIM_SIGNED, EXTRA_MPART_TYPE,FREEMAIL_FROM,
> HTML_MESSAGE, L_UNVERIFIED_GMAIL, PART_CID_STOCK, RELAYCOUNTRY_HIGH,
> RELAYCOUNTRY_US, SPF_HELO_PASS, SPF_PASS, TVD_FW_GRAPHIC_NAME_LONG,
> T_TVD_FW_GRAPHIC_ID1
>
> there's a couple of things here.

> I'm not sure which of those scored what. [...]

Seconded. I do see quite a few custom rules. How much did they score?

Even stranger, there is a T_-prefixed rule, which of course is not
stock; that prefix is generally used for non-published rules still in
evaluation. How did that one end up in there? What does it score?

No full report headers, no scores. Nothing we can possibly say.


--
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


mysqlstudent at gmail

Oct 14, 2009, 7:40 PM

Post #4 of 7 (702 views)
Permalink
Re: Mismarked Ham [In reply to]

Hi,

> What makes you think any of the rules are incorrect? A score of 6.1 is not
> 100% (or even 99%, IIRC) spam.

Incorrect in that at least one of the rules fired when it should not
have, causing the valid email to be marked as spam.

> there's a couple of things here.
>
> First, for some reason you have DKIM_SIGNED but not DKIM_VERIFIED, which
> seems odd as this looks like a legit gmail message with a legit DKIM
> signature. So there's one thing to check.

Why is that? How do I go about figuring that out?

> I'm not sure which of those scored what. Then there is the fact that your
> custom rule  "L_UNVERIFIED_GMAIL" hit. If that's the same rule I see in the
> list archives, that scored 2.5 and pushed this email firmly into being
> tagged as spam.

Yes, that looks like it. It was posted by Dan McDonald on August 25th
to the list. It's a meta:

meta L_UNVERIFIED_GMAIL !DKIM_VERIFIED && __L_FROM_GMAIL && !__L_VIA_ML
priority L_UNVERIFIED_GMAIL 500
score L_UNVERIFIED_GMAIL 2.5

I've set it to 0.5 for now. Ideas on tracking down the DKIM_VERIFIED
issue would be appreciated.
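
To make the effect of that meta concrete, here is a rough Python sketch of its boolean logic together with the score arithmetic. It assumes the 2.5 score was in effect when the 6.1 total was computed and that nothing else about the message changes; the hit state of the two __-prefixed sub-rules is inferred from the meta having fired, since sub-rules are not listed in the header.

```python
def unverified_gmail(hits):
    """Mirror of: !DKIM_VERIFIED && __L_FROM_GMAIL && !__L_VIA_ML."""
    return (
        "DKIM_VERIFIED" not in hits
        and "__L_FROM_GMAIL" in hits
        and "__L_VIA_ML" not in hits
    )


# Assumption: inferred sub-rule state for the message under discussion.
hits = {"DKIM_SIGNED", "__L_FROM_GMAIL"}

KILL = 5.0             # kill= value from the X-Spam-Status header
reported_total = 6.1   # hits= value from the same header
old_score, new_score = 2.5, 0.5

# Total if only the score of L_UNVERIFIED_GMAIL changes.
adjusted = round(reported_total - old_score + new_score, 1)
# Total if DKIM_VERIFIED had fired and the meta had not hit at all.
without_rule = round(reported_total - old_score, 1)

print("meta fires:", unverified_gmail(hits))
print("fires once DKIM verifies:", unverified_gmail(hits | {"DKIM_VERIFIED"}))
print("total at 0.5:", adjusted, "spam:", adjusted >= KILL)
print("total without the rule:", without_rule, "spam:", without_rule >= KILL)
```

Under those assumptions either change alone keeps this message below the 5.0 threshold, which is why the missing DKIM_VERIFIED hit matters more than any of the other rules.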

> Maybe adjust that score, or adjust the assumptions that caused that rule to
> be added to your config?
>
> This IS a gmail message, right? So your unverified-gmail custom rule is in
> error.

Yes, that's correct. I think you've identified the root of the problem.

Thanks so much.
Best regards,
Alex


mysqlstudent at gmail

Oct 14, 2009, 8:21 PM

Post #5 of 7 (694 views)
Permalink
Re: Mismarked Ham [In reply to]

Hi,

>> I'm not sure which of those scored what. [...]
>
> Seconded. I do see quite a few custom rules. How much did they score?

My apologies; I hadn't realized so much of it was non-standard. It's
obviously hard to help without seeing the rules and knowing what they
are for. I've re-run the message through SA. It looks like the Bayes
result has now changed, which brings the score to 8.2. I've also
reduced L_UNVERIFIED_GMAIL from 2.5 to 0.5.

X-Spam-Report:
* 2.0 RELAYCOUNTRY_HIGH Relayed by a country thats a bad spam source
* 0.0 RELAYCOUNTRY_US Relayed through United States
* 1.0 EXTRA_MPART_TYPE Header has extraneous Content-type:...type= entry
* 0.5 FREEMAIL_FROM Sender email is freemail (learnlivelove[at]gmail.com)
* -0.0 SPF_PASS SPF: sender matches SPF record
* -0.0 SPF_HELO_PASS SPF: HELO matches SPF record
* 0.0 DKIM_SIGNED Domain Keys Identified Mail: message has a signature
* 0.0 BAYES_50 BODY: Bayesian spam probability is 40 to 60%
*     [score: 0.5000]
* 1.1 TVD_FW_GRAPHIC_NAME_LONG BODY: TVD_FW_GRAPHIC_NAME_LONG
* 0.0 HTML_MESSAGE BODY: HTML included in message
* 0.0 T_TVD_FW_GRAPHIC_ID1 BODY: T_TVD_FW_GRAPHIC_ID1
* 1.4 SARE_GIF_ATTACH FULL: Email has a inline gif
* 1.6 PART_CID_STOCK Has a spammy image attachment (by Content-ID)
* 0.5 L_UNVERIFIED_GMAIL L_UNVERIFIED_GMAIL
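
As a quick sanity check on the report, the per-rule scores can be added up and ranked. This is only a sketch in Python using the one-decimal values exactly as printed above; Decimal is used so the sum is not blurred by binary floating point.

```python
from decimal import Decimal

# (rule, score) pairs copied from the X-Spam-Report above, as printed.
REPORT = [
    ("RELAYCOUNTRY_HIGH", "2.0"),
    ("RELAYCOUNTRY_US", "0.0"),
    ("EXTRA_MPART_TYPE", "1.0"),
    ("FREEMAIL_FROM", "0.5"),
    ("SPF_PASS", "-0.0"),
    ("SPF_HELO_PASS", "-0.0"),
    ("DKIM_SIGNED", "0.0"),
    ("BAYES_50", "0.0"),
    ("TVD_FW_GRAPHIC_NAME_LONG", "1.1"),
    ("HTML_MESSAGE", "0.0"),
    ("T_TVD_FW_GRAPHIC_ID1", "0.0"),
    ("SARE_GIF_ATTACH", "1.4"),
    ("PART_CID_STOCK", "1.6"),
    ("L_UNVERIFIED_GMAIL", "0.5"),
]

scores = [(name, Decimal(value)) for name, value in REPORT]
total = sum(score for _, score in scores)

# Rules that actually add points, largest first.
contributors = sorted(
    ((name, score) for name, score in scores if score > 0),
    key=lambda item: item[1],
    reverse=True,
)

print("sum of printed scores:", total)
for name, score in contributors:
    print(f"{score:>4} {name}")
```

The printed values add up to 8.1 rather than the 8.2 quoted above. The gap is presumably rounding in the report: SARE_GIF_ATTACH, for example, is 1.42 in the rule file but shows as 1.4. The ranking also shows that the image-related rules and RELAYCOUNTRY_HIGH carry most of the weight now that L_UNVERIFIED_GMAIL is down to 0.5.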

Should SARE_GIF_ATTACH be such a high value by default?

full SARE_GIF_ATTACH /name=\"?[0-9a-z._\-]{3,18}\.gif\"?/i
describe SARE_GIF_ATTACH Email has a inline gif
score SARE_GIF_ATTACH 1.42
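
To get a feel for how broad that pattern is, here is a small Python sketch that tries the same regular expression against a few made-up Content-Type lines. The sample strings are hypothetical, and Python's re module is standing in for Perl here; for this particular pattern the two should behave the same, but that is an assumption rather than something tested inside SpamAssassin.

```python
import re

# Same pattern as the SARE_GIF_ATTACH rule, with /i expressed as IGNORECASE.
GIF_NAME = re.compile(r'name="?[0-9a-z._\-]{3,18}\.gif"?', re.IGNORECASE)


def matches(header_line):
    """Return True if the rule's pattern is found anywhere in the line."""
    return GIF_NAME.search(header_line) is not None


# Hypothetical attachment headers, for illustration only.
samples = [
    'Content-Type: image/gif; name="background.gif"',
    "Content-Type: image/gif; name=IMAGE01.GIF",
    'Content-Type: image/gif; name="' + "x" * 19 + '.gif"',
    'Content-Type: image/jpeg; name="photo.jpg"',
]

for line in samples:
    print(matches(line), line)
```

Any GIF whose filename stem is between 3 and 18 characters of letters, digits, dots, underscores or hyphens matches, quoted or not, so the rule would hit on most ordinary inline GIFs as well as spammy ones.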

I think this one might also be too aggressive by default?

meta PART_CID_STOCK (__ANY_IMAGE_ATTACH && __PART_STOCK_CID && !__PART_STOCK_CL && !__PART_STOCK_CD_F)
describe PART_CID_STOCK Has a spammy image attachment (by Content-ID)

> Even more strange, there is a T_ prefixed rule, which of course is not
> stock. And generally used for NON-published rules still in evaluation.
> How did that one end up in there? What does it score?

That originated in updates_spamassassin_org/72_active.cf, so it's part
of the channel updates:

mimeheader T_TVD_FW_GRAPHIC_ID1 Content-Id =~ /<[0-9a-f]{12}(?:\$[0-9a-f]{8}){2}\@/
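
For reference, the shape that pattern looks for can be tried out in Python. The Content-Id values below are invented for illustration and are not taken from the message in question; the sketch also assumes Python's re handles this pattern the same way Perl does.

```python
import re

# Same pattern as T_TVD_FW_GRAPHIC_ID1: "<", 12 lowercase hex digits,
# then two "$"-prefixed groups of 8 lowercase hex digits, then "@".
CID = re.compile(r"<[0-9a-f]{12}(?:\$[0-9a-f]{8}){2}@")


def cid_matches(content_id):
    """Return True if the Content-Id value contains the rule's pattern."""
    return CID.search(content_id) is not None


# Hypothetical Content-Id values.
print(cid_matches("<000c01ca4d3a$1a2b3c4d$0a0b0c0d@example>"))
print(cid_matches("<000C01CA4D3A$1A2B3C4D$0A0B0C0D@example>"))
print(cid_matches("<part1.01020304.05060708@example.com>"))
```

The rule as written has no /i modifier, so only lowercase hex digits match. It scored 0.0 in the report above, so it is not what pushed this message over the threshold.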

Thanks,
Alex


uhlar at fantomas

Oct 15, 2009, 12:19 AM

Post #6 of 7 (700 views)
Permalink
Re: Mismarked Ham [In reply to]

> On 14-Oct-2009, at 19:40, MySQL Student wrote:
>> Which rule(s) is then incorrect? What is the right solution here? Is
>> the only option to whitelist the user?

On 14.10.09 19:54, LuKreme wrote:
> What makes you think any of the rules are incorrect? A score of 6.1 is
> not 100% (or even 99%, IIRC) spam.
>
> your spam test were:
>
> X-Spam-Status: Yes, hits=6.1 tag1=-300.0 tag2=5.0 kill=5.0
> use_bayes=1 tests=BAYES_00, DKIM_SIGNED, EXTRA_MPART_TYPE,
> FREEMAIL_FROM,
> HTML_MESSAGE, L_UNVERIFIED_GMAIL, PART_CID_STOCK, RELAYCOUNTRY_HIGH,
> RELAYCOUNTRY_US, SPF_HELO_PASS, SPF_PASS, TVD_FW_GRAPHIC_NAME_LONG,
> T_TVD_FW_GRAPHIC_ID1
>
> there's a couple of things here.
>
> First, for some reason you have DKIM_SIGNED but not DKIM_VERIFIED, which
> seems odd as this looks like a legit gmail message with a legit DKIM
> signature. So there's one thing to check.

I think there was a problem with the DKIM package in the past that
resulted in exactly this symptom. The OP should upgrade his Mail::DKIM module.

--
Matus UHLAR - fantomas, uhlar [at] fantomas ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Fighting for peace is like fucking for virginity...


uhlar at fantomas

Oct 15, 2009, 12:20 AM

Post #7 of 7 (695 views)
Permalink
Re: Mismarked Ham [In reply to]

> > What makes you think any of the rules are incorrect? A score of 6.1 is not
> > 100% (or even 99%, IIRC) spam.

On 14.10.09 22:40, MySQL Student wrote:
> Incorrect in that at least one of the rules fired when they should not
> have, making the valid email to be marked as spam.

Or maybe they didn't fire when they should have.
Or maybe the scores are not properly set.

However, I advised you to upgrade Mail::DKIM.
--
Matus UHLAR - fantomas, uhlar [at] fantomas ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Save the whales. Collect the whole set.
