
Mailing List Archive: SpamAssassin: users

Shortcircuit Question

 

 



inetadmin at ruraltel

May 7, 2008, 9:19 PM

Post #1 of 7 (130 views)
Shortcircuit Question

I have been reading through the Shortcircuit man page as well as some
articles on the Wiki, and the way I see it performing in our install
does not match how I understand it is supposed to work.

First off, we are using SpamAssassin 3.2.4 as provided by the rpmforge
RPMs.

I have about a dozen priorities specified, mainly for URIBL, SURBL,
DCC, Razor, Pyzor and Bayes.

What I am seeing is that the first shortcircuit rule hits and scores
appropriately, but subsequent shortcircuit rules continue to fire. Their
scores are then totaled along with the original scores for the rules.

My understanding of how shortcircuit should work is that once a
shortcircuit rule is triggered, any subsequent rules should be bypassed
and the message either classified as spam/ham or, if set to on, scored
with the current score specified for the rule.

For instance:

I have the following:

priority URIBL_BLACK -500
priority URIBL_JP_SURBL -498
priority URIBL_SC_SURBL -488
priority URIBL_OB_SURBL -487

priority SC_URIBL_SURBL -480
priority SC_URIBL_SBL -479

priority RAZOR2_CHECK -450
priority DCC_CHECK -449
priority PYZOR_CHECK -448

priority SC_URIBL_HASH -440

meta SC_URIBL_SURBL (URIBL_BLACK && (URIBL_SC_SURBL || URIBL_JP_SURBL || URIBL_OB_SURBL))

meta SC_URIBL_SBL ((URIBL_BLACK || URIBL_SC_SURBL || URIBL_JP_SURBL || URIBL_OB_SURBL) && URIBL_SBL)

meta SC_URIBL_HASH ((URIBL_BLACK || URIBL_SC_SURBL || URIBL_JP_SURBL || URIBL_OB_SURBL) && (RAZOR2_CHECK || DCC_CHECK || PYZOR_CHECK))




shortcircuit SC_URIBL_SURBL spam
shortcircuit SC_URIBL_SBL spam
shortcircuit SC_URIBL_HASH spam

score SC_URIBL_SURBL 100.00
score SC_URIBL_HASH 100.00
score SC_URIBL_SBL 100.00


I do not have a recent debug log to show, but from the debug output I
do see SC_URIBL_SURBL trigger after the earlier-priority rules have run.
However, the remaining priorities are then run, and if the criteria are
met, the RAZOR, DCC, and PYZOR rules run and then the SC_URIBL_HASH
rule triggers. That gives a total score of 200, plus the scores of the
URIBL/SURBL rules that hit and, if they hit, the Razor, DCC, and Pyzor
scores as well.

I was thinking that after SC_URIBL_SURBL triggered, the remaining rules
would not run and the spam classification would take precedence.

Am I overlooking the obvious, have I misunderstood how the SC should
work, is it something with the rpm that was released by rpmforge? Any
thoughts or insight would be appreciated.

Clay
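
For readers comparing the two modes described in the question, here is a
minimal sketch of how "on" and "spam" differ, as I understand the
Shortcircuit plugin documentation. The rule name LOCAL_SC_DEMO and its
pattern are hypothetical and are not part of the poster's setup.

```
# The plugin must be loaded, normally from a .pre file:
# loadplugin Mail::SpamAssassin::Plugin::Shortcircuit

# Hypothetical local rule, for illustration only.
body         LOCAL_SC_DEMO  /shortcircuit demo phrase/
describe     LOCAL_SC_DEMO  Demo rule for shortcircuit modes
priority     LOCAL_SC_DEMO  -100

# Mode "on": stop evaluating later rules, keep the rule's own score.
score        LOCAL_SC_DEMO  5.5
shortcircuit LOCAL_SC_DEMO  on

# Mode "spam" (an alternative to the two lines above): classify as spam
# and take the score from shortcircuit_spam_score instead.
# shortcircuit_spam_score    100
# shortcircuit LOCAL_SC_DEMO spam
```

With "on" the rule contributes its own 5.5; with "spam" the plugin is
documented to substitute shortcircuit_spam_score. In both cases the
expectation is that lower-priority (later) rules are skipped once the
rule hits.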


mkettler_sa at verizon

May 7, 2008, 11:07 PM

Post #2 of 7 (123 views)
Re: Shortcircuit Question [In reply to]

Clayton Keller wrote:
> Am I overlooking the obvious, have I misunderstood how the SC should
> work, is it something with the rpm that was released by rpmforge? Any
> thoughts or insight would be appreciated.

SA is, rather fortunately, circumventing what you're trying to do
because of how DNS is handled internally.

DO NOT try to split up the priority of DNS-based tests. Priority and
shortcircuiting are intended to be used on *fast* rules, not slow ones.

If you were successful, you would make the performance of SpamAssassin
absurdly slow by serializing DNS queries. *OUCH*. SA normally runs these
in parallel, and running them in serial would very seriously impact
performance.

Currently, all DNS-based tests "run" at their priority, but that only
launches the DNS queries. All the results are gathered together at
HARVEST_DNSBL_PRIORITY, which is currently set to 500. None of the rules
will actually trigger until this point.
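
A sketch of the conservative setup this advice points toward:
short-circuit only on rules that are cheap to evaluate locally, and leave
the network tests at their default priorities. The lines below mirror the
commented-out suggestions I recall from the stock local.cf shipped with
3.2.x; treat them as an illustration and check your own copy before
relying on them.

```
# Cheap, local decisions: whitelist/blacklist hits and confident Bayes results.
shortcircuit USER_IN_WHITELIST   on
shortcircuit USER_IN_BLACKLIST   on
shortcircuit BAYES_99            spam
shortcircuit BAYES_00            ham

# No "priority" lines for URIBL_*, RAZOR2_CHECK, DCC_CHECK or PYZOR_CHECK:
# the network tests keep their defaults so their lookups can overlap.
```

The trade-off is that a message is only short-circuited when one of the
local rules hits; network results never end the scan early in this
arrangement.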


jm at jmason

May 8, 2008, 1:49 AM

Post #3 of 7 (121 views)
Re: Shortcircuit Question [In reply to]

Matt Kettler writes:
> Currently, all DNS based tests "run" at their priority, but that only
> launches the DNS queries. All the results are gathered together at
> HARVEST_DNSBL_PRIORITY, which is currently set to 500. None of the rules
> will actually trigger until this point.

actually Matt, you're wrong ;) if some of the network rules are
at a higher priority than others, and are used in shortcircuit rules,
SpamAssassin 3.2.x will indeed sleep until the results of those rules
arrive.

The idea is that, if you have the memory to support that degree of
concurrency, you can make a local policy decision to do that, instead
of doing the lookups at the MTA level which does effectively the
same thing.

This wait is logged, so you can spot it with --debug on.

--j.


inetadmin at ruraltel

May 8, 2008, 6:35 AM

Post #4 of 7 (121 views)
Re: Shortcircuit Question [In reply to]

Justin Mason wrote:
> actually Matt, you're wrong ;) if some of the network rules are
> at a higher priority than others, and are used in shortcircuit rules,
> SpamAssassin 3.2.x will indeed sleep until the results of those rules
> arrive.
>
> The idea is that, if you have the memory to support that degree of
> concurrency, you can make a local policy decision to do that, instead
> of doing the lookups at the MTA level which does effectively the
> same thing.
>
> This wait is logged, so you can spot it with --debug on.
>
> --j.
>

With that said, Justin, is the behavior I am seeing correct? Even though
the first prioritized shortcircuit rule hits, and I see that in the
debug log, shouldn't it bypass the remaining rules rather than
continuing to process until all the shortcircuit priorities have run?

From reading the original bug in which this feature was introduced,
along with the man page and a wiki post with an example by you, that
is how I understood it to function.

Clay


jm at jmason

May 8, 2008, 6:51 AM

Post #5 of 7 (120 views)
Re: Shortcircuit Question [In reply to]

Clayton Keller writes:
> With that said, Justin, is the behavior I am seeing correct? Even though
> the first prioritized shortcircuit rule hits, and I see that in the
> debug log, shouldn't it bypass the remaining rules rather than
> continuing to process until all the shortcircuit priorities have run?
>
> From reading the original bug in which this feature was introduced,
> along with the man page and a wiki post with an example by you, that
> is how I understood it to function.

actually, no, it sounds like a bug. could you open a bugzilla with
a demonstration config/test message?

--j.
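
One possible shape for such a demonstration config, offered as a sketch
only. The rule names and patterns below are hypothetical, and the two
rules are plain body rules rather than the network-based meta rules from
the original report, so this may not exercise the same code path.

```
# Hypothetical minimal reproduction: two shortcircuit rules at different
# priorities, mirroring the structure (not the content) of the report.
body         SC_REPRO_FIRST   /sc-repro-first/
priority     SC_REPRO_FIRST   -480
shortcircuit SC_REPRO_FIRST   spam
score        SC_REPRO_FIRST   100.00

body         SC_REPRO_SECOND  /sc-repro-second/
priority     SC_REPRO_SECOND  -440
shortcircuit SC_REPRO_SECOND  spam
score        SC_REPRO_SECOND  100.00
```

A matching test message would contain both phrases in its body. Running
it through spamassassin -D and checking which of the two rules appear in
the report shows whether the second rule was skipped after the first one
hit, which is the behavior the original poster expected.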


inetadmin at ruraltel

May 8, 2008, 9:34 AM

Post #6 of 7 (120 views)
Re: Shortcircuit Question [In reply to]

Justin Mason wrote:
> actually, no, it sounds like a bug. could you open a bugzilla with
> a demonstration config/test message?
>
> --j.
>

Just to add: from my previous debug output, I can confirm the wait for
the tests to finish that you mentioned. I'll make sure this is included
when the bug is filed.

Also, what happened to independently enabling/disabling autolearning via
tflags? In a previous post a while back, I was told that messages hit by
shortcircuit rules are not learned as either ham or spam, in part because
skipping the extra Bayes processing gives an additional performance
gain. Looking through the original patch thread, this appears to have
been part of the discussion to some degree, but I was wondering what the
final say on that was as well.

Clay
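
On the tflags question, the thread itself gives no answer, but the
Shortcircuit plugin documentation, as I read it, describes the "spam"
setting as shorthand that includes the noautolearn flag. A sketch of
that expansion, using a hypothetical rule name:

```
# Documented as roughly equivalent to: shortcircuit LOCAL_SC_DEMO spam
shortcircuit LOCAL_SC_DEMO  on
priority     LOCAL_SC_DEMO  -100
score        LOCAL_SC_DEMO  100
tflags       LOCAL_SC_DEMO  noautolearn
```

If that reading is right, writing the "on" form out by hand is the way to
choose the tflags yourself. Whether a short-circuited message is then
actually autolearned is an assumption that would need checking against
the plugin source or a test run.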


inetadmin at ruraltel

May 8, 2008, 7:44 PM

Post #7 of 7 (115 views)
Re: Shortcircuit Question [In reply to]

Clayton Keller wrote:
> Just to add: from my previous debug output, I can confirm the wait for
> the tests to finish that you mentioned. I'll make sure this is included
> when the bug is filed.

Bug 5906 submitted. Thanks for your help on this.

Clay
