
parkerm at pobox
Sep 24, 2009, 12:16 AM
Fwd: split_into_array_of_short_paragraphs vs short_lines
Anyone have any thoughts on this? The BLANK_LINES rules appear to be broken.

Michael

Begin forwarded message:

> From: Michael Parker <parkerm [at] pobox>
> Date: September 21, 2009 10:22:26 AM CDT
> To: SpamAssassin Dev <dev [at] spamassassin>
> Subject: split_into_array_of_short_paragraphs vs short_lines
>
> Howdy,
>
> I was looking at why our old 3.1 instance of SA was hitting a few of
> the BLANK_LINES_NN_NN rules whereas 3.3 stuff wasn't hitting at
> all. I narrowed it down to what get_decoded_body_text_array returns.
>
> For instance, I have a short mail that is 4 lines long. In 3.1
> get_decoded_body_text_array would return an array with 4 elements
> (i.e. lines), however in 3.3 that call now returns a single-element
> array with a "paragraph."
>
> This of course breaks this code in check_blank_line_ratio:
>
>   if (scalar @{$fulltext} >= $minlines) {
>     foreach my $line (@{$fulltext}) {
>       next if ($line =~ /\S/);
>       $blank++;
>     }
>     $pms->{blank_line_ratio}->{$minlines} = 100 * $blank / scalar @{$fulltext};
>   }
>   else {
>     $pms->{blank_line_ratio}->{$minlines} = -1; # don't report if it's a blank message ...
>   }
>
> Because it's looking at array elements and not actual lines.
>
> I suspect there may be other similar eval rules doing that same thing.
>
> Thoughts?
>
> Michael
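
To make the problem concrete, here is a minimal standalone sketch of one possible workaround: split every array element back into physical lines before counting the blank ones. The helper name blank_line_ratio and its calling convention are hypothetical (this is not SpamAssassin code, and not necessarily how it should be fixed in the tree), and it assumes the paragraph elements still carry their embedded newlines.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of SpamAssassin: returns the percentage of
# blank lines, or -1 if the text has fewer than $minlines lines.
# Assumption: paragraph elements keep their embedded "\n" characters.
sub blank_line_ratio {
    my ($fulltext, $minlines) = @_;

    my @lines;
    foreach my $chunk (@{$fulltext}) {
        # Limit -1 keeps empty fields, so blank lines inside a paragraph
        # are preserved instead of being silently dropped.
        my @parts = split(/\r?\n/, $chunk, -1);

        # A trailing newline yields one extra empty field; drop it.
        pop @parts if @parts > 1 && $parts[-1] eq '';

        # split() returns an empty list for an empty string; treat that
        # element as a single blank line, as the original loop would.
        @parts = ('') unless @parts;

        push @lines, @parts;
    }

    # Too short to report (also avoids dividing by zero).
    return -1 if !@lines || @lines < $minlines;

    my $blank = grep { !/\S/ } @lines;   # count lines with no non-space char
    return 100 * $blank / scalar @lines;
}

# Same 4-line message, once as lines (3.1 style) and once as a single
# paragraph (3.3 style).
my @by_line      = ("Hello\n", "\n", "\n", "Bye\n");
my @by_paragraph = ("Hello\n\n\nBye\n");

printf "%d %d\n",
    blank_line_ratio(\@by_line, 4),
    blank_line_ratio(\@by_paragraph, 4);   # prints "50 50"
```

With the element-counting code quoted above, the single-paragraph form has only one element, so with $minlines = 4 it falls into the else branch and reports -1 instead of 50.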