
Mailing List Archive: SpamAssassin: devel

[Bug 6959] Malformed UTF-8 character - in transliteration at DnsResolver.pm line 627

 

 



bugzilla-daemon at bugzilla

Jul 16, 2013, 5:37 AM

Post #1 of 7 (77 views)
[Bug 6959] Malformed UTF-8 character - in transliteration at DnsResolver.pm line 627

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6959

--- Comment #1 from Mark Martinec <Mark.Martinec [at] ijs> ---
Created attachment 5159
--> https://issues.apache.org/SpamAssassin/attachment.cgi?id=5159&action=edit
A sample message triggering this problem

--
You are receiving this mail because:
You are the assignee for the bug.


bugzilla-daemon at bugzilla

Jul 25, 2013, 5:43 AM

Post #2 of 7 (64 views)
[Bug 6959] Malformed UTF-8 character - in transliteration at DnsResolver.pm line 627 [In reply to]

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6959

--- Comment #2 from Mark Martinec <Mark.Martinec [at] ijs> ---
I see, a bug/misfeature in Net::DNS: we give a properly "RFC 1035 zone
format"-encoded string of bytes (no utf-8 flag) to Net::DNS::Packet->new,
and it gratuitously flags it with a utf-8 flag in the packet->question
section, without checking that the given string of octets really
represents a UTF-8 encoded string, and without any reason to do so,
as DNS is 8-bit clean, works with octet strings, and has no notion of
character sets and encodings.

When we later pull that packet->question string back and start processing
it, it blows up in our face, as it is flagged as a proper character string,
but it isn't.



bugzilla-daemon at bugzilla

Jul 25, 2013, 7:34 AM

Post #3 of 7 (64 views)
[Bug 6959] Malformed UTF-8 character - in transliteration at DnsResolver.pm line 627 [In reply to]

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6959

Kevin A. McGrail <kmcgrail [at] pccc> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |kmcgrail [at] pccc

--- Comment #3 from Kevin A. McGrail <kmcgrail [at] pccc> ---
Good catch. Do we need to open a bug with Net::DNS, require a specific version,
or is there a workaround, do you think?



bugzilla-daemon at bugzilla

Jul 25, 2013, 8:37 AM

Post #4 of 7 (64 views)
[Bug 6959] Malformed UTF-8 character - in transliteration at DnsResolver.pm line 627 [In reply to]

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6959

--- Comment #4 from Mark Martinec <Mark.Martinec [at] ijs> ---
I'm partly retracting my statement: we do get back from Net::DNS
in a question section a utf8-flagged string (despite us supplying
plain bytes). The resulting string is not actually invalid: it is
encoded as ASCII in RFC 1035 zone format encoding, so it is
quite unnecessarily flagged as utf8. An invalid character string
is formed when we try to decode the RFC 1035 zone format encoding
while keeping the utf8 flag unchanged.

The cleanest fix would be for Net::DNS not to gratuitously turn
on the utf8 flag on RFC 1035 zone format encoded strings.
A fix/workaround on our side is to decode a returned character
string to bytes before doing the RFC 1035 zone format decoding
on them. ...

... (not to mention a wish/ramble that Net::DNS should start offering
an 8-bit clean API instead of requiring an application to decode/encode
domain names in a format intended for zone file editing. After all,
Perl as well as the DNS system are both 8-bit clean and allow arbitrary
bytes (including nulls) in labels and results)

The solution sounds like a simple call to Encode::encode, although
that routine (along with Encode::is_utf8 and Encode::_utf8_off) has a long
history of misbehaving on various versions of Perl, especially
when dealing with tainted strings.

Working on it...
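
The workaround described above can be illustrated with a rough sketch. This is a Python analogue, not the Perl code from SpamAssassin: the function names are hypothetical, and it assumes (as stated above) that the zone-format text coming back is plain ASCII. The point it shows is the order of operations: first turn the character string into bytes, and only then undo the RFC 1035 zone format escapes, so that each \DDD escape becomes exactly one octet.

```python
import re

# Matches the two RFC 1035 zone-format escapes: \DDD (three decimal
# digits) and \X (a backslash followed by any single character).
_ESCAPE = re.compile(rb"\\(?:([0-9]{3})|(.))", re.DOTALL)


def zone_unescape(name: bytes) -> bytes:
    """Undo zone-format escapes, producing raw octets."""

    def repl(match):
        if match.group(1) is not None:
            value = int(match.group(1))
            if value > 255:
                raise ValueError("escape out of octet range: %r" % match.group(0))
            return bytes([value])
        return match.group(2)

    return _ESCAPE.sub(repl, name)


def decode_question_name(name) -> bytes:
    """Hypothetical helper: accept text or bytes, always return bytes."""
    if isinstance(name, str):
        # Assumption: zone-format text is pure ASCII, so this cannot lose data.
        name = name.encode("ascii")
    return zone_unescape(name)
```

For example, decode_question_name("cn\\163\\172") yields the three-part byte string b"cn\xa3\xac", regardless of whether the input arrived as text or as bytes.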



bugzilla-daemon at bugzilla

Jul 25, 2013, 9:29 AM

Post #5 of 7 (64 views)
[Bug 6959] Malformed UTF-8 character - in transliteration at DnsResolver.pm line 627 [In reply to]

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6959

--- Comment #5 from Mark Martinec <Mark.Martinec [at] ijs> ---
Btw, I'll leave it to somebody else to decide (and possibly fix) if a query
for an A record of a domain name:

www.moe.gov.cn<A3><AC><D1><A7><D0><C5><CD><F8><CF><B5><BD><CC><D3><FD>
<B2><BF><CE><A8><D2><BB><D6><B8><B6><A8><D1><A7><C0><FA><C8><CF><D6><A4>
<B2><E9><D1><AF><CD><F8><A3><AC><CD><F8><D6><B7>www.chsi.com.cn

(as generated from the attached sample message) is worth making
or worth the trouble of avoiding. As far as a DNS system is concerned
the query is valid, and my intention is to make sure it can be queried.
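
To make concrete what such a name looks like in RFC 1035 zone format, here is a small Python sketch (an illustration only, not code from SpamAssassin or Net::DNS). The function name and the exact set of characters given a backslash escape are assumptions chosen for the example; the \DDD form for non-printable octets follows RFC 1035.

```python
# Assumption for this sketch: these printable characters are treated as
# special in zone files and get a backslash escape.
_SPECIAL = b'.\\"();@$'


def zone_escape_label(label: bytes) -> str:
    """Render one DNS label (raw octets) as ASCII zone-format text."""
    out = []
    for byte in label:
        if byte in _SPECIAL:
            out.append("\\" + chr(byte))
        elif 0x21 <= byte <= 0x7E:
            # Printable ASCII passes through unchanged.
            out.append(chr(byte))
        else:
            # Everything else becomes a three-digit decimal escape.
            out.append(f"\\{byte:03d}")
    return "".join(out)
```

Applied to the start of the third label above, zone_escape_label(b"cn\xa3\xac\xd1\xa7") gives the ASCII text cn\163\172\209\167, which is the kind of string that goes into and comes back from the packet's question section.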



bugzilla-daemon at bugzilla

Jul 25, 2013, 9:36 AM

Post #6 of 7 (64 views)
[Bug 6959] Malformed UTF-8 character - in transliteration at DnsResolver.pm line 627 [In reply to]

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6959

--- Comment #6 from Kevin A. McGrail <kmcgrail [at] pccc> ---
(In reply to Mark Martinec from comment #5)
> Btw, I'll leave it to somebody else to decide (and possibly fix) if a query
> for an A record of a domain name:
>
> www.moe.gov.cn<A3><AC><D1><A7><D0><C5><CD><F8><CF><B5><BD><CC><D3><FD>
> <B2><BF><CE><A8><D2><BB><D6><B8><B6><A8><D1><A7><C0><FA><C8><CF><D6><A4>
> <B2><E9><D1><AF><CD><F8><A3><AC><CD><F8><D6><B7>www.chsi.com.cn
>
> (as generated from the attached sample message) is worth making
> or worth the trouble of avoiding. As far as a DNS system is concerned
> the query is valid, and my intention is to make sure it can be queried.

As a lazy English-only speaker, I'm at a loss for comment because I've never
seen a valid link like that. Dear me.



bugzilla-daemon at bugzilla

Jul 25, 2013, 4:04 PM

Post #7 of 7 (64 views)
[Bug 6959] Malformed UTF-8 character - in transliteration at DnsResolver.pm line 627 [In reply to]

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6959

--- Comment #7 from Mark Martinec <Mark.Martinec [at] ijs> ---
Created attachment 5161
--> https://issues.apache.org/SpamAssassin/attachment.cgi?id=5161&action=edit
Proposed patch

How about this fix (attached and committed):

trunk:
Bug 6959 - Malformed UTF-8 character in transliteration at DnsResolver.pm
Sending lib/Mail/SpamAssassin/Dns.pm
Sending lib/Mail/SpamAssassin/DnsResolver.pm
Sending lib/Mail/SpamAssassin/Plugin/AskDNS.pm
Sending lib/Mail/SpamAssassin/Util.pm
Committed revision 1507148.

- Util.pm: encode character string as it comes from a Net::DNS::Packet
query section back into bytes before "RFC 1035 zone file format"-decoding
it to avoid producing an invalid character string

- Util.pm: rename fmt_dns_question_entry to decode_dns_question_entry
to better reflect its purpose

- DnsResolver::send(): encode a character string (with a utf8 flag)
as it comes from the SPF plugin to plain bytes in order to avoid
challenging Net::DNS

- DnsResolver.pm: issue an info message if a domain name is
a character string (with a utf8 flag) instead of plain bytes
(it seems the source of these was exclusively the SPF plugin,
actually the underlying SPF Perl module)
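
The last two items (normalize on the way out, and log when a character string shows up) might look roughly like the following. This is a Python analogue with hypothetical names, not the committed Perl change; the logger name and the choice of UTF-8 for the encoding step are assumptions made for the sketch.

```python
import logging

log = logging.getLogger("dns_sketch")  # hypothetical logger name


def domain_to_bytes(domain) -> bytes:
    """Return the domain as plain bytes, noting when text was supplied."""
    if isinstance(domain, str):
        # A caller handed over a character string instead of bytes:
        # report it, then encode so the resolver only ever sees octets.
        log.info("domain name is a character string, encoding: %r", domain)
        return domain.encode("utf-8")
    return bytes(domain)
```

Byte strings pass through untouched, so a caller that already does the right thing pays nothing beyond the type check.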

