Mailing List Archive: SpamAssassin: users

spamassassin rule set issue

 

 



parakrama1282 at gmail

Apr 17, 2012, 5:03 AM

Post #1 of 12 (601 views)
spamassassin rule set issue

Hi guys,

I have the following rule in place in SpamAssassin:

rawbody BLOCK_RULE2 /(\W|^)Orange(\W|^)/i
score BLOCK_RULE2 50
describe BLOCK_RULE2 Bad Word

One of my mails got blocked even though it doesn't contain the word
"Orange". When I run the mail through SpamAssassin, the report says the
rule matched, as shown below. The mail does contain words like
"Orangicat".

 pts rule name              description
---- ---------------------- --------------------------------------------------
-0.7 RCVD_IN_DNSWL_LOW      RBL: Sender listed at http://www.dnswl.org/, low
                            trust
                            [209.85.217.180 listed in list.dnswl.org]
-0.0 SPF_PASS               SPF: sender matches SPF record
 0.0 WEIRD_PORT             URI: Uses non-standard port number for HTTP
 0.0 HTML_MESSAGE           BODY: HTML included in message
-1.9 BAYES_00               BODY: Bayes spam probability is 0 to 1%
                            [score: 0.0000]
  50 BLOCK_RULE2            RAW: Bad Word
X-Spam-Flag: YES




Any idea why this is happening?

Thank You
Dhanushka


thomas.kinghorn at gmail

Apr 17, 2012, 5:09 AM

Post #2 of 12 (577 views)
Re: spamassassin rule set issue [In reply to]

On 17/04/2012 14:03, dhanushka ranasinghe wrote:
>
> Any idea why this is happening ?
>
> Thank You
> Dhanushka
>

Try

/^Orange$/i

The $ specifies end of the word.

Regards
Tom


swatir88 at gmail

Apr 17, 2012, 5:14 AM

Post #3 of 12 (569 views)
Re: spamassassin rule set issue [In reply to]

On Tue, Apr 17, 2012 at 5:39 PM, Tom Kinghorn <thomas.kinghorn [at] gmail> wrote:

> On 17/04/2012 14:03, dhanushka ranasinghe wrote:
>
>>
>> Any idea why this is happening ?
>>
>> Thank You
>> Dhanushka
>>
>>
> Try
>
> /^Orange$/i
>
> The $ specifies end of the word.
>
> Regards
> Tom
>

I think this should work:

/\bOrange\b/i

Regards,
Swati


parakrama1282 at gmail

Apr 17, 2012, 5:18 AM

Post #4 of 12 (572 views)
Re: spamassassin rule set issue [In reply to]

Hi.. guys..

I don't think the regex is the issue. I tested /(\W|^)Orange(\W|^)/i
and it correctly does the exact word match.


Thank You
Dhanushka

On 17 April 2012 17:44, Swati R <swatir88 [at] gmail> wrote:
>
>
> On Tue, Apr 17, 2012 at 5:39 PM, Tom Kinghorn <thomas.kinghorn [at] gmail>
> wrote:
>>
>> On 17/04/2012 14:03, dhanushka ranasinghe wrote:
>>>
>>>
>>> Any idea why this is happening  ?
>>>
>>> Thank You
>>> Dhanushka
>>>
>>
>> Try
>>
>> /^Orange$/i
>>
>> The $ specifies end of the word.
>>
>> Regards
>> Tom
>
>
> I think, this should work :
>
> /\bOrange\b/i
>
> Regards,
> Swati


swatir88 at gmail

Apr 17, 2012, 5:37 AM

Post #5 of 12 (578 views)
Re: spamassassin rule set issue [In reply to]

Try testing the rules below if you are trying to flag only mails that
contain the exact word 'orange', and not others such as 'orangecat'.

The rest will depend on your requirements.

Thanks,
Swati

On Tue, Apr 17, 2012 at 5:48 PM, dhanushka ranasinghe <
parakrama1282 [at] gmail> wrote:

> Hi.. guys..
>
> I don't think regex is the issue , i tested the /(\W|^)Orange(\W|^)/i
> its correctly doing the exact word match
>
>
> Thank You
> Dhanushka
>
> On 17 April 2012 17:44, Swati R <swatir88 [at] gmail> wrote:
> >
> >
> > On Tue, Apr 17, 2012 at 5:39 PM, Tom Kinghorn <thomas.kinghorn [at] gmail
> >
> > wrote:
> >>
> >> On 17/04/2012 14:03, dhanushka ranasinghe wrote:
> >>>
> >>>
> >>> Any idea why this is happening ?
> >>>
> >>> Thank You
> >>> Dhanushka
> >>>
> >>
> >> Try
> >>
> >> /^Orange$/i
> >>
> >> The $ specifies end of the word.
> >>
> >> Regards
> >> Tom
> >
> >
> > I think, this should work :
> >
> > /\bOrange\b/i
> >
> > Regards,
> > Swati
>


thomas.kinghorn at gmail

Apr 17, 2012, 5:39 AM

Post #6 of 12 (571 views)
Re: spamassassin rule set issue [In reply to]

On 17/04/2012 14:18, dhanushka ranasinghe wrote:
> Hi.. guys..
>
> I don't think regex is the issue , i tested the /(\W|^)Orange(\W|^)/i
> its correctly doing the exact word match
>
>
> Thank You
> Dhanushka
>

Firstly, please do not "top post"

Secondly, I disagree with you completely.....

The ^ (caret) indicates "start of the word", so why have it at the end???


rwmaillists at googlemail

Apr 17, 2012, 6:05 AM

Post #7 of 12 (571 views)
Re: spamassassin rule set issue [In reply to]

On Tue, 17 Apr 2012 14:39:41 +0200
Tom Kinghorn wrote:

> On 17/04/2012 14:18, dhanushka ranasinghe wrote:
> > Hi.. guys..
> >
> > I don't think regex is the issue , i tested
> > the /(\W|^)Orange(\W|^)/i its correctly doing the exact word match
> >
> >
> > Thank You
> > Dhanushka
> >
>
> Firstly, please do not "top post"
>
> Secondly, I disagree with you completely.....
>
> The ^ (carat) indicates "start of the word", so why have it at the
> end???

It is pretty irrelevant. Clearly it should have
been /(\W|^)Orange(\W|$)/i, or simply /\bOrange\b/i, but that would only
make it fail to fire in a minority of cases, and the problem here is
that it's an FP.

I wonder if perhaps spamd (or whatever daemon is being used) still has an
older version of the rule loaded and needs to be restarted.
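
To make the point above concrete, here is a minimal sketch in Python rather than Perl, on the assumption that Python's re module treats \W, ^ and $ the same way Perl does when no multi-line flag is set. The sample strings are invented for illustration. It compares the rule as posted with the corrected form.

```python
import re

# The rule as posted: the trailing group repeats ^ where $ was intended.
posted = re.compile(r"(\W|^)Orange(\W|^)", re.IGNORECASE)
# The corrected form suggested in the reply above.
fixed = re.compile(r"(\W|^)Orange(\W|$)", re.IGNORECASE)

# Hypothetical sample strings, not taken from the mail in question.
samples = ["an Orange tree", "an Orange", "Orangicat"]

# For each sample, record (posted pattern matched, fixed pattern matched).
results = {s: (bool(posted.search(s)), bool(fixed.search(s))) for s in samples}

for text, (hit_posted, hit_fixed) in results.items():
    print(f"{text!r}: posted={hit_posted} fixed={hit_fixed}")
```

The two patterns differ only when "Orange" is the last thing in the text, where the posted pattern misses and the corrected one fires. Neither matches "Orangicat", which is consistent with the view that the regex alone does not explain the false positive.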


martin at gregorie

Apr 17, 2012, 6:15 AM

Post #8 of 12 (578 views)
Re: spamassassin rule set issue [In reply to]

On Tue, 2012-04-17 at 14:39 +0200, Tom Kinghorn wrote:
> On 17/04/2012 14:18, dhanushka ranasinghe wrote:
> > Hi.. guys..
> >
> > I don't think regex is the issue , i tested the /(\W|^)Orange(\W|^)/i
> > its correctly doing the exact word match
> >
> >
> > Thank You
> > Dhanushka
> >
>
> Firstly, please do not "top post"
>
> Secondly, I disagree with you completely.....
>
> The ^ (carat) indicates "start of the word", so why have it at the end???
>
Indeed, and /^Orange$/i will only match Orange if it is the entire line.
In fact, as SA converts each paragraph into one long line in body rules,
it will only match a paragraph containing just the word 'Orange'.

/\borange\b/i is what I'd use.


Martin
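
A small sketch of the difference described above, written in Python on the assumption that its re module handles ^, $ and \b like Perl for these simple patterns. Each string stands in for one paragraph that has already been flattened into a single line; the strings themselves are made up for illustration.

```python
import re

# Anchored at both ends: the whole string must be the word.
anchored = re.compile(r"^Orange$", re.IGNORECASE)
# Word boundaries: the word may appear anywhere in the string.
word = re.compile(r"\borange\b", re.IGNORECASE)

# Hypothetical one-line paragraphs.
paragraphs = [
    "Orange",
    "I bought an orange today.",
    "a drink of Orangeade now",
]

# For each paragraph, record (anchored matched, word-boundary matched).
matches = {p: (bool(anchored.search(p)), bool(word.search(p))) for p in paragraphs}

for text, (hit_anchored, hit_word) in matches.items():
    print(f"{text!r}: anchored={hit_anchored} word={hit_word}")
```

Only the paragraph consisting of the single word satisfies the anchored pattern, while the word-boundary pattern also finds the word inside a sentence and still skips "Orangeade".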


thomas.kinghorn at gmail

Apr 17, 2012, 6:18 AM

Post #9 of 12 (574 views)
Re: spamassassin rule set issue [In reply to]

On 17/04/2012 15:15, Martin Gregorie wrote:
> On Tue, 2012-04-17 at 14:39 +0200, Tom Kinghorn wrote:
> Indeed, and /^Orange$/i will only match Orange if it is the entire line.
> In fact, as SA converts each paragraph into one long line in body rules,
> it will only match a paragraph containing just the word 'Orange'.
>
> /\borange\b/i is what I'd use.
>
>
> Martin
>
>
Noted.

Thanks Martin.


martin at gregorie

Apr 17, 2012, 8:00 AM

Post #10 of 12 (559 views)
Re: spamassassin rule set issue [In reply to]

On Tue, 2012-04-17 at 15:18 +0200, Tom Kinghorn wrote:

> > /\borange\b/i is what I'd use.
> >
>
I should have added that recent versions of grep understand Perl regex
syntax, which can be useful for rapidly checking regexes before writing
an SA rule. The main differences are that the regex should be enclosed
in single quotes rather than forward slashes, and that neither the 'm'
prefix used by Perl to change the regex delimiters nor the /../i suffix
is understood. For example, I was able to run through the suggestions
for this case very rapidly by using something like

grep -iP '\bOrange\b' <words.txt

where the -P option says that the regex is in Perl syntax, the -i option
sets case insensitivity and words.txt contains:

a line
Orange
an Orange
a drink of Orangeade now
a final line

Beware that the grep man page says "This is highly experimental and
grep -P may warn of unimplemented features." IOW using grep is only the
first step in developing a rule. You should still check the completed SA
rule against ham, spam and (preferably) edge cases to make sure it
does no more and no less than you want it to do.


Martin
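
The same quick check can be sketched without grep. The following is an illustrative Python version, assuming Python's re module agrees with Perl on these simple patterns; the helper name matching_lines and the pattern labels are made up for this example. It runs the candidate patterns from this thread over the five sample lines listed above.

```python
import re

# The sample lines from words.txt in the post above.
lines = [
    "a line",
    "Orange",
    "an Orange",
    "a drink of Orangeade now",
    "a final line",
]

# Candidate patterns raised in this thread (labels are arbitrary).
candidates = {
    "posted": r"(\W|^)Orange(\W|^)",
    "anchored": r"^Orange$",
    "word": r"\bOrange\b",
}

def matching_lines(pattern, lines):
    """Return the lines that a case-insensitive search for pattern hits."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [line for line in lines if rx.search(line)]

report = {name: matching_lines(p, lines) for name, p in candidates.items()}

for name, hits in report.items():
    print(f"{name}: {hits}")
```

On these five lines the word-boundary pattern picks out "Orange" and "an Orange", the anchored pattern picks out only "Orange", and the pattern as posted matches none of them because every occurrence is at the end of its line. As with grep, this is only a first pass before testing the finished rule in SA itself.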







Bowie_Bailey at BUC

Apr 17, 2012, 9:19 AM

Post #11 of 12 (563 views)
Re: spamassassin rule set issue [In reply to]

On 4/17/2012 8:03 AM, dhanushka ranasinghe wrote:
> Hi.. guys
>
> i have following rule in place in spamassassin,
>
> rawbody BLOCK_RULE2 /(\W|^)Orange(\W|^)/i
> score BLOCK_RULE2 50
> describe BLOCK_RULE2 Bad Word
>
> but one of my mails got blocked even-though its doesn't have word
> "Orange" , but when search via the mail spamassassin show mail has
> word Orange by displaying following.but that mail have words like
> "Orangicat"

Some good suggestions here already. While your original regexp should
have worked in most cases, the optimal regexp for this situation is:

/\borange\b/i

(as has been noted previously)

If you are still having problems with false positives, please post the
exact rule you are using and put one or two samples of emails that
generate the false positive in pastebin so that we can see exactly what
is happening. Since we are talking about a body rule here, it is ok to
munge the headers if you are worried about privacy. If you make any
changes to the subject or body, please run it through SA afterwards to
make sure it still generates the false positive.

--
Bowie


brennan at columbia

Apr 18, 2012, 10:23 AM

Post #12 of 12 (557 views)
Re: spamassassin rule set issue [In reply to]

>> rawbody BLOCK_RULE2 /(\W|^)Orange(\W|^)/i

> Some good suggestions here already. While your original regexp should
> have worked in most cases, the optimal regexp for this situation is:
>
> /\borange\b/i
>


And probably body, not rawbody. Rawbody won't match if the spammer
obfuscates words with HTML tags, e.g. ora<tag>nge.

Joseph Brennan
Columbia University Information Technology
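
A rough Python illustration of the obfuscation point above. The tag-stripping substitution is a deliberately crude stand-in for HTML rendering and is an assumption made for this sketch; it is not how SpamAssassin actually renders a message body. The sample markup is invented.

```python
import re

word = re.compile(r"\borange\b", re.IGNORECASE)

# Hypothetical raw HTML in which the word is split by an empty tag pair.
raw_html = "<p>Fresh ora<span></span>nge juice</p>"

# Crude stand-in for rendering: drop anything that looks like a tag.
# Illustrative assumption only, not SpamAssassin's real renderer.
rendered = re.sub(r"<[^>]*>", "", raw_html)

raw_hit = bool(word.search(raw_html))
rendered_hit = bool(word.search(rendered))

print(f"raw text matched: {raw_hit}")
print(f"rendered text matched: {rendered_hit}")
```

The pattern cannot see the word in the raw markup because the tags sit between its letters, but it does match once the tags are gone, which is the reason given above for preferring a body rule here.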
