
Mailing List Archive: SpamAssassin: devel

[Bug 6788] URL detection sometimes does not work

 

 



bugzilla-daemon at bugzilla

May 14, 2012, 9:02 AM

Post #1 of 3 (247 views)
[Bug 6788] URL detection sometimes does not work

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6788

Kevin A. McGrail <kmcgrail [at] pccc> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
CC| |kmcgrail [at] pccc
Resolution|--- |DUPLICATE

--- Comment #4 from Kevin A. McGrail <kmcgrail [at] pccc> ---
I believe this is a duplicate of a bug already in the system.

*** This bug has been marked as a duplicate of bug 6751 ***

--
You are receiving this mail because:
You are the assignee for the bug.


bugzilla-daemon at bugzilla

May 14, 2012, 9:14 AM

Post #2 of 3 (231 views)
[Bug 6788] URL detection sometimes does not work [In reply to]

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6788

Lemat <lemat [at] lemat> changed:

What |Removed |Added
----------------------------------------------------------------------------
Resolution|DUPLICATE |WORKSFORME

--- Comment #5 from Lemat <lemat [at] lemat> ---
This is not a duplicate of bug 6751. The dot is always 2E hex.

--
You are receiving this mail because:
You are the assignee for the bug.


bugzilla-daemon at bugzilla

May 14, 2012, 12:43 PM

Post #3 of 3 (230 views)
[Bug 6788] URL detection sometimes does not work [In reply to]

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6788

Kevin A. McGrail <kmcgrail [at] pccc> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |REOPENED
Resolution|WORKSFORME |---

--- Comment #6 from Kevin A. McGrail <kmcgrail [at] pccc> ---
(In reply to comment #5)
> This is not a duplicate of bug 6751. The dot is always 2E hex.

Sorry about that. I read it as an alternate character being used and lumped
the two bugs together.

Reopening, though I tried your small fix in HTML.pm:

Index: lib/Mail/SpamAssassin/HTML.pm
===================================================================
--- lib/Mail/SpamAssassin/HTML.pm (revision 1338322)
+++ lib/Mail/SpamAssassin/HTML.pm (working copy)
@@ -240,6 +240,10 @@
 # the HTML::Parser API won't do it for us
 $text =~ s/<(\w+)\s*\/>/<$1>/gi;

+ # Bug 6788 https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6788
+ # we want a space after a closing tag so that URLs aren't lumped together
+ $text =~ s/>/> /g;
+
 # Ignore stupid warning that can't be suppressed: 'Parsing of
 # undecoded UTF-8 will give garbage when decoding entities at ..' (bug 4046)
 {


This breaks html_obfu.t: 5 of its 9 tests fail (tests 1-5).

t/html_obfu.t 9 5 55.56% 1-5
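
One possible reason for the failures, offered as a guess rather than a diagnosis: the substitution in the patch matches every '>', not only the one that ends a closing tag, so it also puts spaces inside words that are split by inline tags. The sketch below shows that effect on a made-up sample. The sample string and the strip_tags helper are hypothetical stand-ins; HTML.pm renders through HTML::Parser, not a regex, and nothing here is taken from html_obfu.t.

```perl
use strict;
use warnings;

# Hypothetical sample: a word split by an inline tag, then two adjacent links.
my $html = 'V<b>ia</b>gra '
         . '<a href="http://a.example/">x</a>'
         . '<a href="http://b.example/">y</a>';

# The substitution from the patch: add a space after every '>'.
(my $patched = $html) =~ s/>/> /g;

# Crude tag stripper standing in for real HTML rendering (an assumption
# made for illustration only).
sub strip_tags {
    my ($s) = @_;
    $s =~ s/<[^>]*>//g;
    return $s;
}

# Print both renderings in brackets so added whitespace is visible.
print "[", strip_tags($html),    "]\n";
print "[", strip_tags($patched), "]\n";
```

In this sample the two link texts are separated after the patch, which is what the fix is after, but the split word also picks up spaces. A rule that looks for a word broken up by tags could then see different text than before.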

Thoughts?

--
You are receiving this mail because:
You are the assignee for the bug.
