
Mailing List Archive: SpamAssassin: devel

[Bug 6963] New: Anybody cares for a saved millisecond or two in computing bayes probabilities for tokens?

 

 



bugzilla-daemon at bugzilla

Jul 30, 2013, 11:58 AM


https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6963

Bug ID: 6963
Summary: Anybody cares for a saved millisecond or two in
computing bayes probabilities for tokens?
Product: Spamassassin
Version: SVN Trunk (Latest Devel Version)
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: P2
Component: Libraries
Assignee: dev [at] spamassassin
Reporter: Mark.Martinec [at] ijs

Wondering where the time reported under the 'b_comp_prob' timing entry
(computing bayes probabilities for tokens) is actually spent, I played
a bit with the beautiful NYTProf perl profiler and reshuffled some Bayes
code while keeping its functionality unchanged.

The basic idea is to compute the probabilities for all tokens in one
go, instead of calling _compute_prob_for_token() once per token.
This allows the loop-invariant sections to be factored out of the loop.
So instead of:
Plugin::Bayes::_compute_prob_for_token
we now call:
Plugin::Bayes::_compute_prob_for_all_tokens
(and _compute_prob_for_token() is now just a wrapper).
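
To illustrate the general idea (this is not the actual SpamAssassin
code, which is Perl), here is a minimal Python sketch of hoisting
loop-invariant work out of a per-token function into a batch function.
The function names, the argument layout and the simplified probability
formula are assumptions made for this example only; the real Bayes
plugin does more work per token.

```python
def prob_for_token(spam_count, ham_count, n_spam, n_ham):
    """Per-token version: re-checks the corpus totals on every call."""
    if n_spam == 0 or n_ham == 0:
        return None
    ratio_s = spam_count / n_spam
    ratio_h = ham_count / n_ham
    total = ratio_s + ratio_h
    if total == 0:
        return None  # token never seen, no probability available
    return ratio_s / total


def prob_for_all_tokens(counts, n_spam, n_ham):
    """Batch version: the invariant checks run once, outside the loop.

    `counts` is a list of (spam_count, ham_count) pairs, one per token.
    """
    if n_spam == 0 or n_ham == 0:
        return [None] * len(counts)
    result = []
    for spam_count, ham_count in counts:
        ratio_s = spam_count / n_spam
        ratio_h = ham_count / n_ham
        total = ratio_s + ratio_h
        result.append(ratio_s / total if total else None)
    return result


def prob_for_token_wrapper(spam_count, ham_count, n_spam, n_ham):
    """The old single-token entry point, kept as a thin wrapper."""
    return prob_for_all_tokens([(spam_count, ham_count)], n_spam, n_ham)[0]
```

Existing callers of the single-token function keep working through the
wrapper, while code that already has the full token list avoids one
function call and one round of invariant checks per token.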

The savings are smaller than I had hoped: about 1.2 ms for a typical
larger message with one or two hundred tokens, and a barely noticeable
speedup for messages with only a few tokens. When dumping tokens
(sa-learn --dump) the saving is about 6 seconds (out of one minute)
with my current Redis database.

Still, the work is done now, so I wonder whether we would like it
folded in or not.

