
Mailing List Archive: SpamAssassin: users

CHARSET_FARAWAY seems to be ignored

 

 



ashvin at haeseandharris

Apr 12, 2012, 10:39 PM

Post #1 of 8 (703 views)
Permalink
CHARSET_FARAWAY seems to be ignored

I am executing SpamAssassin (version 3.2.5) through an Exim4 transport on a
Debian server. Everything in the routing process works fine and my
user_prefs file is consulted as required.

I am trying to mark all non-English emails as spam but the configuration
I've got in my user_prefs file doesn't seem to work. Here's the
X-Spam-USER-REPORT header, in the email that I want to be marked as spam.
---------------------
X-Spam-USER-REPORT:
* -1.0 RCVD_IN_DNSWL_LOW RBL: Sender listed at http://www.dnswl.org/, low
*      trust
*      [209.85.214.172 listed in list.dnswl.org]
* -0.0 SPF_PASS SPF: sender matches SPF record
* 0.0 HTML_MESSAGE BODY: HTML included in message
* 6.0 TVD_SPACE_RATIO BODY: TVD_SPACE_RATIO
---------------------

As you can see, TVD_SPACE_RATIO scores 6.0, which is a custom score I set
in my user_prefs file just to confirm that at least some of my settings
are honored.
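
A report like the one above can also be checked mechanically. The following is a minimal Python sketch, not part of SpamAssassin; the function name `parse_report` and the regular expression are assumptions made for illustration. It extracts the (score, rule) pairs from the report lines and skips the wrapped description lines.

```python
import re

# Matches report lines such as "* 6.0 TVD_SPACE_RATIO BODY: ...".
# Wrapped description lines ("* trust", "* [209.85...]") carry no score
# and therefore do not match.
HIT_RE = re.compile(r"^\*\s+(-?\d+(?:\.\d+)?)\s+([A-Z][A-Z0-9_]*)\b")


def parse_report(report):
    """Return a list of (score, rule_name) pairs found in the report text."""
    hits = []
    for line in report.splitlines():
        match = HIT_RE.match(line.strip())
        if match:
            hits.append((float(match.group(1)), match.group(2)))
    return hits


# The report text quoted in the post, copied as-is (including the wrapping).
REPORT = """\
* -1.0 RCVD_IN_DNSWL_LOW RBL: Sender listed at http://www.dnswl.org/,
low
* trust
* [209.85.214.172 listed in list.dnswl.org]
* -0.0 SPF_PASS SPF: sender matches SPF record
* 0.0 HTML_MESSAGE BODY: HTML included in message
* 6.0 TVD_SPACE_RATIO BODY: TVD_SPACE_RATIO
"""

hits = parse_report(REPORT)
rule_names = {name for _, name in hits}
print(sorted(rule_names))
print("CHARSET_FARAWAY" in rule_names)  # prints False for the report above
```

Run against the quoted report, this confirms that the custom TVD_SPACE_RATIO score is applied while CHARSET_FARAWAY is absent from the hits.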

The email in question was created and sent using Gmail with 'Default
transliteration language' set to Arabic. So, CHARSET_FARAWAY should have
been triggered (since ok_locales is set to en) but it doesn't seem like it
has.

Here are my local.cf and user_prefs files:
http://old.nabble.com/file/p33679812/local.cf local.cf

http://old.nabble.com/file/p33679812/user_prefs user_prefs
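
The linked files are not reproduced in this archive. As a rough sketch of the setup described in the post, a user_prefs file aiming for this behaviour might contain lines like the following; the score value is an assumed example, not taken from the poster's file.

```
# user_prefs sketch (illustrative only; the actual linked file is not shown)
# Accept only character sets associated with the English locale:
ok_locales en
# Assumed example score for the body charset rule:
score CHARSET_FARAWAY 5.0
```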
--
View this message in context: http://old.nabble.com/CHARSET_FARAWAY-seems-to-be-ignored-tp33679812p33679812.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.


niamh at fullbore

Apr 12, 2012, 10:57 PM

Post #2 of 8 (675 views)
Permalink
Re: CHARSET_FARAWAY seems to be ignored [In reply to]

Hello haese,

Friday, April 13, 2012, 6:39:00 AM, you wrote:

h> The email in question was created and sent using Gmail with 'Default
h> transliteration language' set to Arabic. So, CHARSET_FARAWAY should have
h> been triggered (since ok_locales is set to en) but it doesn't seem like it
h> has.

Did you edit /etc/mail/spamassassin/v310.pre so that the TextCat
plugin is enabled?
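
For readers following along, enabling the plugin usually amounts to uncommenting one line. A sketch, assuming the stock v310.pre layout in which the line ships commented out:

```
# /etc/mail/spamassassin/v310.pre
# Remove the leading '#' from this line so that TextCat is loaded:
loadplugin Mail::SpamAssassin::Plugin::TextCat
```

If spamd is in use, it has to be restarted before a change to a .pre file takes effect.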

--
Best regards,
Niamh mailto:niamh [at] fullbore


ashvin at haeseandharris

Apr 12, 2012, 11:28 PM

Post #3 of 8 (674 views)
Permalink
Re: CHARSET_FARAWAY seems to be ignored [In reply to]

Thank you for the reply.

I am not using TextCat so didn't load the plugin as you suggest. The
reason is that the following thread:
http://www.mail-archive.com/users [at] spamassassin/msg69225.html

leads me to believe that TextCat might not be suitable for my needs, since
(as quoted within that thread):
"Textcat is not designed to decide what language the email is, but to
find a set of languages it *might* be. It is very prone to declaring
extra languages that are not really present due to it's design"

Here is another quote from the thread:
http://old.nabble.com/Problems-with-Cyrillic-spam-td32978897.html#a32981171

It reads:
"ok_locales functions by identifying character sets that can only be used
for a specific language. UTF8, Windows-1255, and koi8 are not such
character sets, because they can also be used to write in English."

So, even though 'Default transliteration language' is set to Arabic in
Gmail, could the character set used still be one of UTF8, Windows-1255 or
koi8? In that case, ok_locales would fail, correct?
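
The idea in the quoted explanation can be illustrated with a small Python sketch. The charset table below is hypothetical and far smaller than anything SpamAssassin uses, and `is_faraway` is an illustrative name, not SpamAssassin's implementation. The point is only that a language-specific charset label can be matched against ok_locales, while a universal one such as UTF-8 says nothing about the language.

```python
# Hypothetical, deliberately tiny charset -> locale table.
# This is NOT SpamAssassin's real mapping; it exists only for illustration.
CHARSET_LOCALES = {
    "iso-8859-6": {"ar"},
    "windows-1256": {"ar"},
    "iso-2022-jp": {"ja"},
    "euc-kr": {"ko"},
    "big5": {"zh"},
}


def is_faraway(charset, ok_locales):
    """Return True if the charset implies a locale outside ok_locales.

    Charsets missing from the table (for example UTF-8) can carry any
    language, so they give no basis for a decision and return False.
    """
    if "all" in ok_locales:
        return False
    locales = CHARSET_LOCALES.get(charset.lower())
    if locales is None:
        return False
    return not (locales & set(ok_locales))


print(is_faraway("windows-1256", ["en"]))        # True: Arabic-only charset
print(is_faraway("UTF-8", ["en"]))               # False: universal charset
print(is_faraway("windows-1256", ["en", "ar"]))  # False: Arabic is allowed
```

Under this model, an Arabic message labelled UTF-8 would pass a charset-based check even with ok_locales set to en. Whether the Gmail message in question was in fact labelled UTF-8 is an assumption that would need to be confirmed from its Content-Type header.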


niamh at fullbore

Apr 13, 2012, 12:06 AM

Post #4 of 8 (673 views)
Permalink
Re: CHARSET_FARAWAY seems to be ignored [In reply to]

Hello Ashvin,

Friday, April 13, 2012, 7:28:27 AM, you wrote:

AN> I am not using TextCat so didn't load the plugin as you suggest.

In which case you won't get a hit on that rule, as
"eval:check_for_faraway_charset()" uses that plugin, if my memory
serves.

--
Best regards,
Niamh mailto:niamh [at] fullbore


swatir88 at gmail

Apr 13, 2012, 12:33 AM

Post #5 of 8 (675 views)
Permalink
Re: CHARSET_FARAWAY seems to be ignored [In reply to]

On Fri, Apr 13, 2012 at 11:58 AM, Ashvin Narayanan <
ashvin [at] haeseandharris> wrote:

> Thank you for the reply.
>
> I am not using TextCat so didn't load the plugin as you suggest. The
> reason is that the following thread:
> http://www.mail-archive.com/users [at] spamassassin/msg69225.html
>
> leads me to believe that TextCat might not be suitable for my needs, since
> (as quoted within that thread):
> "Textcat is not designed to decide what language the email is, but to
> find a set of languages it *might* be. It is very prone to declaring
> extra languages that are not really present due to it's design"
>
>
That's true. There should be no second opinion on that.



> Here is another quote from the thread:
> http://old.nabble.com/Problems-with-Cyrillic-spam-td32978897.html#a32981171
>
> It reads:
> "ok_locales functions by identifying character sets that can only be used
> for a specific language. UTF8, Windows-1255, and koi8 are not such
> character sets, because they can also be used to write in English."
>
> So, even though 'Default transliteration language' is set to Arabic in
> Gmail, could the character set used still be one of UTF8, Windows-1255 or
> koi8? In that case, ok_locales would fail, correct?
>
>
>
For me, ok_locales does not work at all, even with the TextCat plugin
enabled. But with ok_languages, the following rule gets applied from the
TextCat plugin (for Arabic as well):

1.5 BODY_8BITS BODY: Body includes 8 consecutive 8-bit
characters

The other rule (UNWANTED_LANGUAGE_BODY) applies only sometimes.
But this plugin is definitely a good one for language-based mail filtering.
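
A minimal sketch of the ok_languages approach described above, assuming the TextCat plugin is loaded; the score shown is an assumed example value, not the poster's setting.

```
# local.cf or user_prefs sketch (illustrative only)
# Languages considered acceptable (an option provided by the TextCat plugin):
ok_languages en
# Assumed example score for mail whose detected language is not acceptable:
score UNWANTED_LANGUAGE_BODY 3.0
```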

Regards,
Swati R


me at junc

Apr 13, 2012, 7:54 AM

Post #6 of 8 (659 views)
Permalink
Re: CHARSET_FARAWAY seems to be ignored [In reply to]

On 2012-04-13 07:39, haese wrote:
> I am executing Spamassassin (version 3.2.5) through an Exim4
> transport on a
> Debian server. Everything in the routing process works fine and my
> user_prefs file is consulted as required.

http://archive.apache.org/dist/spamassassin/ hmm, the current releases
there date from 2005; it seems updates are needed.

But as I remember, 3.2.5 was already deprecated, yet it still seems to be
listed on the download page as supported?

http://spamassassin.apache.org/downloads.cgi?update=201106220000 Here
3.2.5 is available for download; will sa-update still work with that release?


KMcGrail at PCCC

Apr 13, 2012, 8:15 AM

Post #7 of 8 (663 views)
Permalink
Re: CHARSET_FARAWAY seems to be ignored [In reply to]

On 4/13/2012 10:54 AM, Benny Pedersen wrote:
> On 2012-04-13 07:39, haese wrote:
>> I am executing Spamassassin (version 3.2.5) through an Exim4
>> transport on a
>> Debian server. Everything in the routing process works fine and my
>> user_prefs file is consulted as required.
>
> http://archive.apache.org/dist/spamassassin/ hmm, the current releases
> there date from 2005; it seems updates are needed.
>
> But as I remember, 3.2.5 was already deprecated, yet it still seems to be
> listed on the download page as supported?
Being available for download and being supported are two different things.

> http://spamassassin.apache.org/downloads.cgi?update=201106220000 Here
> 3.2.5 is available for download; will sa-update still work with that release?
3.2.5 is EOL so there are no guarantees updates will be released for
that. I had in my head that updates were a big reason for the 3.3.0
release but we must have been testing it earlier.

regards,
KAM


me at junc

Apr 13, 2012, 8:33 AM

Post #8 of 8 (669 views)
Permalink
Re: CHARSET_FARAWAY seems to be ignored [In reply to]

On 2012-04-13 17:15, Kevin A. McGrail wrote:
> On 4/13/2012 10:54 AM, Benny Pedersen wrote:
>> On 2012-04-13 07:39, haese wrote:
>>> I am executing Spamassassin (version 3.2.5) through an Exim4
>>> transport on a
>>> Debian server. Everything in the routing process works fine and my
>>> user_prefs file is consulted as required.
>>
>> http://archive.apache.org/dist/spamassassin/ hmm, the current releases
>> there date from 2005; it seems updates are needed.
>>
>> But as I remember, 3.2.5 was already deprecated, yet it still seems to be
>> listed on the download page as supported?
> Being available for download and being supported are two different
> things.

Correct, I am just still trying to understand why Debian still has
3.2.5 as current.

>> http://spamassassin.apache.org/downloads.cgi?update=201106220000
>> Here 3.2.5 is available for download; will sa-update still work with
>> that release?
> 3.2.5 is EOL so there are no guarantees updates will be released for
> that. I had in my head that updates were a big reason for the 3.3.0
> release but we must have been testing it earlier.

Hopefully the Debian maintainers follow this mailing list.
