
Mailing List Archive: Perl: porters

Re: Encode memory corruption [perl #70528]

 

 



perl at greerga

Nov 15, 2009, 8:38 PM

Post #1 of 5 (362 views)

On Sun, 15 Nov 2009, George Greer wrote:

> I haven't pared the crashing script down enough to post as a test case but
> will do so as soon as I can. I wanted to make sure the bug report was in
> before 5.12 escaped. The test is PerlIO::encoding an input file that contains
> Latin-1 high characters and then re-encoding them for output, but it is 918
> lines at the moment.

Test script:

- - - 8< - - - 8< - - -
use Encode qw[encode];
encode("ISO-8859-1", "\x{b6} \x{b6} \x{b6} \x{b6} \x{b6} \x{b6} \x{2022}wwwww \x{2022}rrrrr uuu qqqqqqqqq \x{2022}yyyyyyy xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \x{b6} \x{b6} \x{b6} \x{b6} \x{b6} \x{b6}", sub { "\x{2022}" });
- - - 8< - - - 8< - - -

(That's supposed to be a single line.)

For me, Ubuntu's perl 5.10.0 crashes, blead (GitLive-blead-3146-g88a6f4f)
also crashes, and blead gives this under valgrind:

==17892== Invalid write of size 1
==17892== at 0x608693F: do_encode (encengine.c:119)
==17892== by 0x607C0B2: encode_method (Encode.xs:128)
==17892== by 0x6081938: XS_Encode__XS_encode (Encode.xs:632)
==17892== by 0x54B86B: Perl_pp_entersub (pp_hot.c:2875)
==17892== by 0x4F64DE: Perl_runops_debug (dump.c:2045)
==17892== by 0x449D89: S_run_body (perl.c:2302)
==17892== by 0x449274: perl_run (perl.c:2227)
==17892== by 0x41FA53: main (perlmain.c:117)
==17892== Address 0x5f2fac8 is 0 bytes after a block of size 120 alloc'd
==17892== at 0x4C25153: malloc (vg_replace_malloc.c:195)
==17892== by 0x4F6DA5: Perl_safesysmalloc (util.c:94)
==17892== by 0x55248D: Perl_sv_grow (sv.c:1559)
==17892== by 0x57776E: Perl_newSV (sv.c:4883)
==17892== by 0x607A8BF: encode_method (Encode.xs:105)
==17892== by 0x6081938: XS_Encode__XS_encode (Encode.xs:632)
==17892== by 0x54B86B: Perl_pp_entersub (pp_hot.c:2875)
==17892== by 0x4F64DE: Perl_runops_debug (dump.c:2045)
==17892== by 0x449D89: S_run_body (perl.c:2302)
==17892== by 0x449274: perl_run (perl.c:2227)
==17892== by 0x41FA53: main (perlmain.c:117)

--
George Greer


dankogai at dan

Nov 15, 2009, 9:33 PM

Post #2 of 5 (347 views)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

George,

Thank you for your report.

On 16 Nov 2009, at 13:38, George Greer wrote:
> - - - 8< - - - 8< - - -
> use Encode qw[encode];
> encode("ISO-8859-1", "\x{b6} \x{b6} \x{b6} \x{b6} \x{b6} \x{b6} \x{2022}wwwww \x{2022}rrrrr uuu qqqqqqqqq \x{2022}yyyyyyy xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \x{b6} \x{b6} \x{b6} \x{b6} \x{b6} \x{b6}", sub { "\x{2022}" });

The fallback sub returns a decoded string during encoding, so the usage is wrong to begin with.
That being said, I successfully reproduced your case with the one-liner below.

perl -MEncode -le 'print encode "ascii", " a\x{b6}\x{2022}a"x8, sub{ "\x{2022}" }'

I also found this does not happen in Perl 5.8.9. So this has something to do with how Perl 5.10 allocates memory.

At any rate, ext/Encode/Encode.xs must be the file to look at.

Dan the Maintainer Thereof
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)

iEYEARECAAYFAksA5BYACgkQErJia/WXtBvMiwCdEJ6PbaD8XgC0vXCtL903wu3q
qMUAn1DuSBbgwol6qE5hHyYOxYd6jEGo
=4xqQ
-----END PGP SIGNATURE-----
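
As an illustration of the usage point raised in the post above: the Encode documentation describes a code-reference CHECK argument as receiving the ordinal of the unencodable character and returning the octets to substitute for it. A minimal sketch of that pattern follows, reusing the input from the one-liner. The `<U+XXXX>` replacement format is only an illustrative choice and was not proposed in this thread.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Same input as the one-liner: U+00B6 and U+2022 are both unencodable in ASCII.
my $input = " a\x{b6}\x{2022}a" x 8;

# The fallback receives the code point as a number and returns plain ASCII
# octets, so no character (UTF-8 flagged) string is spliced into the output.
my $bytes = encode("ascii", $input, sub { sprintf "<U+%04X>", shift });

print "$bytes\n";
```

Each unencodable character is replaced by a short ASCII tag, so the result stays a byte string throughout.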


perl at greerga

Nov 15, 2009, 9:46 PM

Post #3 of 5 (337 views)

On Mon, 16 Nov 2009, Dan Kogai wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> George,
>
> Thank you for your report.

And thank you for being so quick to respond.

> On 16 Nov 2009, at 13:38, George Greer wrote:
>> - - - 8< - - - 8< - - -
>> use Encode qw[encode];
>> encode("ISO-8859-1", "\x{b6} \x{b6} \x{b6} \x{b6} \x{b6} \x{b6} \x{2022}wwwww \x{2022}rrrrr uuu qqqqqqqqq \x{2022}yyyyyyy xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \x{b6} \x{b6} \x{b6} \x{b6} \x{b6} \x{b6}", sub { "\x{2022}" });
>
> The fallback sub returns a decoded string during encoding, so the usage is wrong to begin with.
> That being said, I successfully reproduced your case with the one-liner below.
>
> perl -MEncode -le 'print encode "ascii", " a\x{b6}\x{2022}a"x8, sub{ "\x{2022}" }'
>
> I also found this does not happen in Perl 5.8.9. So this has something to do with how Perl 5.10 allocates memory.
>
> At any rate, ext/Encode/Encode.xs must be the file to look at.

It might not crash under perl 5.8.9, but making it crash is finicky
anyway since the script doesn't exercise memory much afterward. Valgrind
says 5.8.9 still causes the errant write:

==30569== Command: /home/perl/work/cpan_maint-5.8/perl/bin/perl5.8.9 -MEncode -le print\ encode\ "ascii",\ "\ a\\x{b6}\\x{2022}a"x8,\ sub{"\\x{2022}"}
==30569==
==30569== Invalid write of size 1
==30569== at 0x629D113: do_encode (encengine.c:119)
==30569== by 0x62970B3: encode_method (Encode.xs:128)
==30569== by 0x629920D: XS_Encode__XS_encode (Encode.xs:621)
==30569== by 0x479D0F: Perl_pp_entersub (pp_hot.c:2862)
==30569== by 0x444896: Perl_runops_debug (dump.c:1639)
==30569== by 0x465582: S_run_body (perl.c:2453)
==30569== by 0x464E77: perl_run (perl.c:2368)
==30569== by 0x421CA8: main (perlmain.c:109)
==30569== Address 0x61bb4c0 is 0 bytes after a block of size 48 alloc'd
==30569== at 0x4C2524D: realloc (vg_replace_malloc.c:476)
==30569== by 0x4451F1: Perl_safesysrealloc (util.c:177)
==30569== by 0x47D440: Perl_sv_grow (sv.c:1440)
==30569== by 0x48445A: Perl_sv_catpvn_flags (sv.c:3915)
==30569== by 0x484752: Perl_sv_catsv_flags (sv.c:3975)
==30569== by 0x6296D7A: encode_method (Encode.xs:204)
==30569== by 0x629920D: XS_Encode__XS_encode (Encode.xs:621)
==30569== by 0x479D0F: Perl_pp_entersub (pp_hot.c:2862)
==30569== by 0x444896: Perl_runops_debug (dump.c:1639)
==30569== by 0x465582: S_run_body (perl.c:2453)
==30569== by 0x464E77: perl_run (perl.c:2368)
==30569== by 0x421CA8: main (perlmain.c:109)

perl 5.10.0 crashed a lot more than blead 5.11.1 during my test case
reduction, but the valgrind still showed the write being there even when
blead didn't crash.

- - - 8< - - - 8< - - -
Summary of my perl5 (revision 5 version 8 subversion 9 patch 35104) configuration:
Platform:
osname=linux, osvers=2.6.28-15-generic, archname=x86_64-linux-thread-multi
uname='linux zwei 2.6.28-15-generic #49-ubuntu smp tue aug 18 19:25:34 utc 2009 x86_64 gnulinux '
config_args='-Dusedevel -Dusethreads -Dinstallbin -Duse64bitall -Dprefix=/home/perl/work/cpan_maint-5.8/perl -Doptimize=-g -des'
- - - 8< - - - 8< - - -

--
George Greer


dankogai at dan

Nov 16, 2009, 12:23 AM

Post #4 of 5 (339 views)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

George,

I think I have fixed it now.

On 16 Nov 2009, at 14:46, George Greer wrote:
> It might not crash under perl 5.8.9, but making it crash is finicky anyway since the script doesn't exercise memory much afterward. Valgrind says 5.8.9 still causes the errant write:
>
> ==30569== Command: /home/perl/work/cpan_maint-5.8/perl/bin/perl5.8.9 -MEncode -le print\ encode\ "ascii",\ "\ a\\x{b6}\\x{2022}a"x8,\ sub{"\\x{2022}"}
> ==30569==
> ==30569== Invalid write of size 1
> ==30569== at 0x629D113: do_encode (encengine.c:119)
> ==30569== by 0x62970B3: encode_method (Encode.xs:128)
> ==30569== by 0x629920D: XS_Encode__XS_encode (Encode.xs:621)
> ==30569== by 0x479D0F: Perl_pp_entersub (pp_hot.c:2862)
> ==30569== by 0x444896: Perl_runops_debug (dump.c:1639)
> ==30569== by 0x465582: S_run_body (perl.c:2453)
> ==30569== by 0x464E77: perl_run (perl.c:2368)
> ==30569== by 0x421CA8: main (perlmain.c:109)
> ==30569== Address 0x61bb4c0 is 0 bytes after a block of size 48 alloc'd
> ==30569== at 0x4C2524D: realloc (vg_replace_malloc.c:476)
> ==30569== by 0x4451F1: Perl_safesysrealloc (util.c:177)
> ==30569== by 0x47D440: Perl_sv_grow (sv.c:1440)
> ==30569== by 0x48445A: Perl_sv_catpvn_flags (sv.c:3915)
> ==30569== by 0x484752: Perl_sv_catsv_flags (sv.c:3975)
> ==30569== by 0x6296D7A: encode_method (Encode.xs:204)
> ==30569== by 0x629920D: XS_Encode__XS_encode (Encode.xs:621)
> ==30569== by 0x479D0F: Perl_pp_entersub (pp_hot.c:2862)
> ==30569== by 0x444896: Perl_runops_debug (dump.c:1639)
> ==30569== by 0x465582: S_run_body (perl.c:2453)
> ==30569== by 0x464E77: perl_run (perl.c:2368)
> ==30569== by 0x421CA8: main (perlmain.c:109)
>
> perl 5.10.0 crashed a lot more than blead 5.11.1 during my test case reduction, but the valgrind still showed the write being there even when blead didn't crash.

Would you try the patch below? That fixed the problem on my OS X.

====
% perl -MEncode -le 'print encode "ascii", " a\x{b6}\x{2022}a"x8, sub{ "\x{2022}" }'
Segmentation fault
% perl -Mblib -MEncode -le 'print encode "ascii", " a\x{b6}\x{2022}a"x8, sub{ "\x{2022}" }'
a••a a••a a••a a••a a••a a••a a••a a••a
====

The patch applies SvUTF8_off to the fallback string when encoding. I also did a little optimization, but that has no bearing on the fix.

I will VERSION++ after your report. Thank you in advance for testing.

Dan the Maintainer Thereof.

===================================================================
RCS file: Encode.xs,v
retrieving revision 2.16
diff -u -r2.16 Encode.xs
--- Encode.xs 2009/09/06 14:32:21 2.16
+++ Encode.xs 2009/11/16 08:17:11
@@ -68,7 +68,7 @@
{
dSP;
int argc;
- SV *temp, *retval;
+ SV *retval = newSVpv("",0);
ENTER;
SAVETMPS;
PUSHMARK(sp);
@@ -79,13 +79,10 @@
if (argc != 1){
croak("fallback sub must return scalar!");
}
- temp = newSVsv(POPs);
+ sv_catsv(retval, POPs);
PUTBACK;
FREETMPS;
LEAVE;
- retval = newSVpv("",0);
- sv_catsv(retval, temp);
- SvREFCNT_dec(temp);
return retval;
}

@@ -199,6 +196,7 @@
: newSVpvf(check & ENCODE_PERLQQ ? "\\x{%04"UVxf"}" :
check & ENCODE_HTMLCREF ? "&#%" UVuf ";" :
"&#x%" UVxf ";", (UV)ch);
+ SvUTF8_off(subchar); /* make sure no decoded string gets in */
sdone += slen + clen;
ddone += dlen + SvCUR(subchar);
sv_catsv(dst, subchar);

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)

iEYEARECAAYFAksBDBsACgkQErJia/WXtBuXBgCdEvbSBofXhu+DlP6qm6mo6ZJW
HUwAnjIAj+daYPByCbCd0ST28PDoSpkA
=84SB
-----END PGP SIGNATURE-----
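
For readers unfamiliar with the UTF8 flag that `SvUTF8_off` clears, the sketch below shows the Perl-level difference between a character string and its UTF-8 octets. It uses the core `utf8::is_utf8` and `utf8::encode` functions as stand-ins. This is not the XS code path touched by the patch, and `utf8::encode` converts the string, whereas `SvUTF8_off` only clears the flag on the existing buffer.

```perl
use strict;
use warnings;

# Helper (hypothetical name) that reports flag state, length, and raw values.
sub describe {
    my ($label, $str) = @_;
    return sprintf "%s: flagged=%s length=%d values=%s",
        $label,
        (utf8::is_utf8($str) ? "yes" : "no"),
        length($str),
        join(" ", map { sprintf "%02x", ord } split //, $str);
}

my $char = "\x{2022}";      # one character, U+2022 BULLET
my $octets = $char;
utf8::encode($octets);      # in place: characters become UTF-8 octets

my $before = describe("character string", $char);
my $after  = describe("octet string", $octets);

print "$before\n$after\n";
```

The character string has length 1 and carries the flag; the octet string has length 3 (e2 80 a2) and does not. That is the kind of string a fallback sub is expected to hand back to the encoder.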


perl at greerga

Nov 16, 2009, 4:12 PM

Post #5 of 5 (338 views)

On Mon, 16 Nov 2009, Dan Kogai wrote:

> George,
>
> I think I have fixed it now.

Yes, Valgrind confirms.

> I will VERSION++ after your report. Thank you in advance for testing.

Thanks! Obviously trying to give UTF-8 as a fallback character was
unintentional, but it turned out more interesting than I wanted.

--
George Greer
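
When a custom replacement string is not needed, Encode's predefined CHECK constants avoid the fallback sub altogether. The sketch below assumes a stock Encode installation. The `FB_PERLQQ`, `FB_HTMLCREF`, and `FB_XMLCREF` modes correspond to the three format strings visible in the patch context earlier in this thread; `LEAVE_SRC` is added here so that `encode` does not modify the source string in place.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+00B6 and U+2022 cannot be represented in ASCII.
my $input = "a\x{b6}\x{2022}a";

# Each mode substitutes an ASCII-only escape for every unencodable character.
my $perlqq   = encode("ascii", $input, Encode::FB_PERLQQ()   | Encode::LEAVE_SRC());
my $htmlcref = encode("ascii", $input, Encode::FB_HTMLCREF() | Encode::LEAVE_SRC());
my $xmlcref  = encode("ascii", $input, Encode::FB_XMLCREF()  | Encode::LEAVE_SRC());

print "$_\n" for $perlqq, $htmlcref, $xmlcref;
```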
