
Mailing List Archive: Perl: porters

Eval is deadly slow

 

 



demerphq at gmail

Nov 2, 2009, 5:31 AM

Post #1 of 37 (690 views)
Permalink
Eval is deadly slow

I have over the past months a number of times observed that eval is
*deadly* slow.

In particular using eval to load data structures is extremely slow.

And the performance degrades substantially the more strings there are
and the longer they are.

I've been thinking that perhaps we could really benefit by reexamining
the string parsing logic to see if it is possible to speed it up. For
instance, the documentation claims we scan a string something like 3
times in order to parse it out. Perhaps we can cut down on this, or
make it more efficient.

Any thoughts? In particular, does this sound worthwhile?

I mean, I recently replaced some code that used eval to load some data
structures with code that uses split instead, with the string parsing and
unescaping done manually. The result was many orders of magnitude
faster. I would expect eval to be slower, but not hundreds of times
slower.
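
As a rough illustration of the split-based approach described above (not the actual code that was replaced), here is a minimal sketch. The record format is an assumption made up for the example: one record per line, tab-separated fields, with backslash, tab and newline inside a field written as \\, \t and \n. The names escape_field, unescape_field, dump_rows and load_rows are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical format: one record per line, fields separated by tabs.
# Backslash, tab and newline inside a field are written as \\, \t and \n.
my %UNESCAPE = ( 't' => "\t", 'n' => "\n", '\\' => '\\' );
my %ESCAPE   = reverse %UNESCAPE;

sub escape_field {
    my ($field) = @_;
    $field =~ s/([\\\t\n])/\\$ESCAPE{$1}/g;
    return $field;
}

sub unescape_field {
    my ($field) = @_;
    # One left-to-right pass, so an escaped backslash is never re-read
    # as the start of another escape.
    $field =~ s/\\([tn\\])/$UNESCAPE{$1}/g;
    return $field;
}

sub dump_rows {
    my ($rows) = @_;
    return join "\n", map { join "\t", map { escape_field($_) } @$_ } @$rows;
}

sub load_rows {
    my ($text) = @_;
    # No eval here: just two splits and one substitution per field.
    return [
        map { [ map { unescape_field($_) } split /\t/, $_, -1 ] }
        split /\n/, $text, -1
    ];
}

my $rows  = [ [ 'id', 'text' ], [ 1, "tab\there" ], [ 2, "back\\slash and\nnewline" ] ];
my $text  = dump_rows($rows);
my $again = load_rows($text);
print scalar(@$again), " rows loaded\n";
```

This only handles flat rows of strings; a row consisting of a single empty field would not survive the round trip, so a real format would need a little more care.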

cheers,
Yves


--
perl -Mre=debug -e "/just|another|perl|hacker/"


rgs at consttype

Nov 2, 2009, 5:38 AM

Post #2 of 37 (672 views)
Permalink
Re: Eval is deadly slow [In reply to]

2009/11/2 demerphq <demerphq [at] gmail>:
> I have over the past months a number of times observed that eval is
> *deadly* slow.
>
> In particular using eval to load data structures is extremely slow.
>
> And the performance degrades substantially the more strings there are
> and the longer they are.
>
> I've been thinking that perhaps we could really benefit by reexamining
> the string parsing logic to see if it is possible to speed it up, for
> instance the documentation claims we scan a string something like 3
> times in order to parse it out. Perhaps we can cut down on this, or
> make it more efficient.
>
> Any thoughts? In particular does this sound worthwhile?

It does. However, the code being what it is (that is,
euphemistically, convoluted), I'd start with a profiling tool. I
wouldn't know where to start.


demerphq at gmail

Nov 2, 2009, 5:43 AM

Post #3 of 37 (673 views)
Permalink
Re: Eval is deadly slow [In reply to]

2009/11/2 Rafael Garcia-Suarez <rgs [at] consttype>:
> It does. However, the code being what it is (that is,
> euphemistically, convoluted), I'd start with a profiling tool. I
> wouldn't know where to start.

I did some performance analysis of this, and it was pretty shocking.
As the length of the strings increases, the time it takes to compile
grows faster than the length does. So for instance double the length of
the string and you more than double the time it takes to parse them (my
mathematical analysis skills aren't strong enough for me to work out what
the exact relationship is). And that is for strings with NO escapes or
embedded constructs.

Yves




--
perl -Mre=debug -e "/just|another|perl|hacker/"


rvtol+usenet at isolution

Nov 2, 2009, 11:22 AM

Post #4 of 37 (668 views)
Permalink
Re: Eval is deadly slow [In reply to]

demerphq wrote:

> I have over the past months a number of times observed that eval is
> *deadly* slow.
>
> In particular using eval to load data structures is extremely slow.
>
> And the performance degrades substantially the more strings there are
> and the longer they are.
>
> I've been thinking that perhaps we could really benefit by reexamining
> the string parsing logic to see if it is possible to speed it up, for
> instance the documentation claims we scan a string something like 3
> times in order to parse it out. Perhaps we can cut down on this, or
> make it more efficient.
>
> Any thoughts? In particular does this sound worthwhile?
>
> I mean, I replaced some code recently that used eval to load some data
> structures to using split instead, with the string parsing and
> unescaping done manually. The result was many orders of magnitude
> faster. I would expect eval to be slower, but not hundreds of times
> slower.

Zefram recently mentioned Parse::Perl

http://search.cpan.org/~zefram/Parse-Perl

--
Ruud


zefram at fysh

Nov 3, 2009, 12:11 PM

Post #5 of 37 (681 views)
Permalink
Re: Eval is deadly slow [In reply to]

Dr.Ruud wrote:
>Zefram recently mentioned Parse::Perl

That won't solve the problem of eval being slow, because it internally
calls the main Perl parser, much like eval does. Its main improvement
over eval is in composability, in that it separates parsing from execution
(and a couple of other bits). I don't think any of its advantages are
relevant when parsing for data structures: it's all about code.

If you want to represent data structures, then anything that works
by parsing code that then has to be executed to get the data is
going to be relatively slow. eval and Parse::Perl both go that way.
You want something that's specialised for representing data structures.
You might want to look at my Data::Pond module, which uses a subset
of Perl syntax and a specialised parser, but I'd be surprised if it's
faster than Storable or other binary formats.
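
To make the contrast concrete, here is a small sketch of the two routes for the same data: serialising it as Perl source that has to be compiled and run (Data::Dumper plus eval), and serialising it with a format built for data (Storable's freeze and thaw). The sample data is made up for the example, and no timings are claimed; this only shows that both routes give back the same structure.

```perl
use strict;
use warnings;
use Data::Dumper;
use Storable qw(freeze thaw);

# Made-up sample data for the illustration.
my $data = { name => 'example', items => [ map { "x" x 100 } 1 .. 50 ] };

# Route 1: write the data out as Perl source ("$VAR1 = {...};"), then
# compile and execute that source to get the structure back.
my $source   = Dumper($data);
my $via_eval = do { my $VAR1; eval $source };
die "eval failed: $@" unless ref $via_eval;

# Route 2: a binary image read by a dedicated parser, no Perl code involved.
my $frozen   = freeze($data);
my $via_thaw = thaw($frozen);

printf "eval source: %d bytes, Storable image: %d bytes\n",
    length($source), length($frozen);
```

Either result can then be timed with the Benchmark module to see how the gap behaves for a particular data set.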

-zefram


demerphq at gmail

Nov 3, 2009, 12:18 PM

Post #6 of 37 (664 views)
Permalink
Re: Eval is deadly slow [In reply to]

2009/11/3 Zefram <zefram [at] fysh>:
> Dr.Ruud wrote:
>>Zefram recently mentioned Parse::Perl
>
> That won't solve the problem of eval being slow, because it internally
> calls the main Perl parser, much like eval does.  Its main improvement
> over eval is in composability, in that it separates parsing from execution
> (and a couple of other bits).  I don't think any of its advantages are
> relevant when parsing for data structures: it's all about code.
>
> If you want to represent data structures, then anything that works
> by parsing code that then has to be executed to get the data is
> going to be relatively slow.  eval and Parse::Perl both go that way.
> You want something that's specialised for representing data structures.
> You might want to look at my Data::Pond module, which uses a subset
> of Perl syntax and a specialised parser, but I'd be surprised if it's
> faster than Storable or other binary formats.

I agree it is always going to be faster to use a customized tool, but
the results of eval are just *horrible*.

Yves



--
perl -Mre=debug -e "/just|another|perl|hacker/"


Tim.Bunce at pobox

Nov 3, 2009, 1:42 PM

Post #7 of 37 (678 views)
Permalink
Re: Eval is deadly slow [In reply to]

On Mon, Nov 02, 2009 at 02:43:52PM +0100, demerphq wrote:
> I did some performance analysis of this, and it was pretty shocking.
> As the length of the strings increases the time it takes to compile
> degrades faster. So for instance double the length of the string and
> you more than double the time it takes to parse them (my mathematical
> analysis skills aren't strong enough for me to work out what the exact
> relationship is). And that is for strings with NO escapes or embedded
> constructs.

Can you post some code to demonstrate?

Tim.


demerphq at gmail

Nov 3, 2009, 2:58 PM

Post #8 of 37 (668 views)
Permalink
Re: Eval is deadly slow [In reply to]

2009/11/3 Tim Bunce <Tim.Bunce [at] pobox>:
> Can you post some code to demonstrate?

use strict;
use warnings;
use Benchmark qw(cmpthese timethese);
use Data::Dumper;

my %all;
for my $len (1,10,50,100,250,500,750,1000,5000,10000) {
    my @s;
    # $code1 embeds every string as a literal; $code2 refers to $s[N] instead.
    my $code1= "sub { \nmy \$out='';\n";
    my $code2= "sub { \nmy \$o='';\n";
    for (1..1000) {
        my $str="x" x $len;
        push @s, $str;
        $code1 .= qq( \$out .= "$str";\n);
        $code2 .= qq( \$o .= \$s[$#s];\n);
    }
    $_.= "}\n" for $code1,$code2;
    # Store the short code and its strings together as NUL-terminated fields.
    $code2=pack"(Z*)*",$code2,@s;
    my $subs= {
        # "long": eval the source with the literals in it.
        "long" => sub { eval $code1 or die "code1 died: $@" },
        # "short": split off the strings, then eval only the small source.
        "short" => sub {
            #my ($c,@s)=unpack"(Z*)*",$code2;
            my ($c,@s)=split /\0/, $code2;
            eval $c or die "code2 died: $@";
        },
    };
    # Sanity check: both generated subs must return the same string.
    my @res;
    foreach my $sub (qw(short long)) {
        push @res, $subs->{$sub}->()->();
    }
    die Dumper(\@res) if !$res[0] or !$res[1] or $res[0] ne $res[1];
    print "Timing length $len\n";
    $all{$len}= timethese(-1,$subs);
}

# Summary: length, then rate (evals/sec) and iteration count for short and long.
foreach my $len (sort {$a <=> $b} keys %all) {
    my $t= $all{$len};
    print join("\t", $len,
            map {
                sprintf "%.2f\t%d",
                    $t->{$_}->iters / $t->{$_}->cpu_a,
                    $t->{$_}->iters
            } qw(short long)
        ),
        "\n",
    ;
}

__END__

Timing length 1
Benchmark: running long, short for at least 1 CPU seconds...
long: 1 wallclock secs ( 1.03 usr + 0.00 sys = 1.03 CPU) @ 271.84/s (n=280)
short: 1 wallclock secs ( 1.12 usr + 0.00 sys = 1.12 CPU) @ 214.29/s (n=240)
Timing length 10
Benchmark: running long, short for at least 1 CPU seconds...
long: 1 wallclock secs ( 1.10 usr + 0.01 sys = 1.11 CPU) @ 252.25/s (n=280)
short: 1 wallclock secs ( 1.12 usr + 0.01 sys = 1.13 CPU) @ 211.50/s (n=239)
Timing length 50
Benchmark: running long, short for at least 1 CPU seconds...
long: 1 wallclock secs ( 1.12 usr + 0.00 sys = 1.12 CPU) @ 230.36/s (n=258)
short: 2 wallclock secs ( 1.08 usr + 0.00 sys = 1.08 CPU) @ 193.52/s (n=209)
Timing length 100
Benchmark: running long, short for at least 1 CPU seconds...
long: 1 wallclock secs ( 1.10 usr + 0.01 sys = 1.11 CPU) @ 200.90/s (n=223)
short: 1 wallclock secs ( 1.13 usr + 0.01 sys = 1.14 CPU) @ 211.40/s (n=241)
Timing length 250
Benchmark: running long, short for at least 1 CPU seconds...
long: 1 wallclock secs ( 1.10 usr + 0.00 sys = 1.10 CPU) @ 173.64/s (n=191)
short: 1 wallclock secs ( 1.17 usr + 0.00 sys = 1.17 CPU) @ 178.63/s (n=209)
Timing length 500
Benchmark: running long, short for at least 1 CPU seconds...
long: 2 wallclock secs ( 1.10 usr + 0.01 sys = 1.11 CPU) @ 121.62/s (n=135)
short: 1 wallclock secs ( 1.16 usr + 0.01 sys = 1.17 CPU) @ 150.43/s (n=176)
Timing length 750
Benchmark: running long, short for at least 1 CPU seconds...
long: 1 wallclock secs ( 1.03 usr + 0.01 sys = 1.04 CPU) @ 89.42/s (n=93)
short: 1 wallclock secs ( 1.04 usr + 0.00 sys = 1.04 CPU) @ 133.65/s (n=139)
Timing length 1000
Benchmark: running long, short for at least 1 CPU seconds...
long: 1 wallclock secs ( 0.75 usr + 0.32 sys = 1.07 CPU) @ 45.79/s (n=49)
short: 1 wallclock secs ( 1.04 usr + 0.12 sys = 1.16 CPU) @ 120.69/s (n=140)
Timing length 5000
Benchmark: running long, short for at least 1 CPU seconds...
long: 1 wallclock secs ( 0.84 usr + 0.20 sys = 1.04 CPU) @ 15.38/s (n=16)
short: 1 wallclock secs ( 0.87 usr + 0.19 sys = 1.06 CPU) @ 41.51/s (n=44)
Timing length 10000
Benchmark: running long, short for at least 1 CPU seconds...
long: 1 wallclock secs ( 0.81 usr + 0.20 sys = 1.01 CPU) @ 8.91/s (n=9)
short: 1 wallclock secs ( 0.80 usr + 0.22 sys = 1.02 CPU) @ 26.47/s (n=27)

length  short/s  short n  long/s   long n
1       214.29   240      271.84   280
10      211.50   239      252.25   280
50      193.52   209      230.36   258
100     211.40   241      200.90   223
250     178.63   209      173.64   191
500     150.43   176      121.62   135
750     133.65   139      89.42    93
1000    120.69   140      45.79    49
5000    41.51    44       15.38    16
10000   26.47    27       8.91     9
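
One way to read the summary table is to turn the rates into time per eval and compare neighbouring lengths. The sketch below does that for the "long" column only; the figures are copied from the table, and ms_per_eval is a helper name introduced for the example. For instance, going from length 500 to length 1000 doubles the length while the time per eval grows by a factor of about 2.66 (121.62 / 45.79). Each eval also carries the fixed cost of compiling 1000 statements, so the ratios at short lengths say little about string scanning itself.

```perl
use strict;
use warnings;

# [ string length, evals per second for "long" ], copied from the table.
my @long = (
    [ 1, 271.84 ],   [ 10, 252.25 ],  [ 50, 230.36 ],   [ 100, 200.90 ],
    [ 250, 173.64 ], [ 500, 121.62 ], [ 750, 89.42 ],   [ 1000, 45.79 ],
    [ 5000, 15.38 ], [ 10000, 8.91 ],
);

# Milliseconds spent on one eval at a given rate (evals per second).
sub ms_per_eval {
    my ($rate) = @_;
    return 1000 / $rate;
}

for my $i (1 .. $#long) {
    my ($prev_len, $prev_rate) = @{ $long[ $i - 1 ] };
    my ($len,      $rate)      = @{ $long[$i] };
    # Time ratio is the inverse of the rate ratio.
    printf "%5d -> %5d: length x%.1f, time x%.2f (%.1f ms per eval)\n",
        $prev_len, $len, $len / $prev_len, $prev_rate / $rate, ms_per_eval($rate);
}
```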


--
perl -Mre=debug -e "/just|another|perl|hacker/"


demerphq at gmail

Nov 3, 2009, 3:11 PM

Post #9 of 37 (664 views)
Permalink
Re: Eval is deadly slow [In reply to]

2009/11/3 demerphq <demerphq [at] gmail>:


[snip] code

The code constructs two subs whose job it is to concatenate 1000
strings, each a repetition of the character "x" of a given length.

The subs are evalled into existence. One sub consists of plain
concatenation of literals, and the other uses a more complex approach,
where the strings in the sub are replaced by array lookups and the
strings are stored together with the sub's source in a null-separated
string. The idea being that in one case we do:

my $sub= eval $code;

and the other we do:

my ($code,@strings)= split /\0/, $_;
my $sub=eval $code;

The results show that for short strings, plain eval wins. As the
strings get longer the split/eval solution becomes a clear win, and
plain eval decays considerably. By the time the strings are 10k long,
the results are quite devastating for eval, with the combined
split/eval approach being about 3 times faster than the plain eval
(26.47/s versus 8.91/s).




--
perl -Mre=debug -e "/just|another|perl|hacker/"


Tim.Bunce at pobox

Nov 4, 2009, 6:08 AM

Post #10 of 37 (660 views)
Permalink
Re: Eval is deadly slow [In reply to]

On Mon, Nov 02, 2009 at 02:43:52PM +0100, demerphq wrote:
> I did some performance analysis of this, and it was pretty shocking.
> As the length of the strings increases the time it takes to compile
> degrades faster. So for instance double the length of the string and
> you more than double the time it takes to parse them (my mathematical
> analysis skills aren't strong enough for me to work out what the exact
> relationship is). And that is for strings with NO escapes or embedded
> constructs.

Just so we're clear, we're talking about the time to compile code which
contains very long literal strings:

$a .= "xxxxxxxxxxxxxxxxxxxxxxx ... xxxxxxxxxxxxxxxxxxx";
$a .= "xxxxxxxxxxxxxxxxxxxxxxx ... xxxxxxxxxxxxxxxxxxx";
$a .= "xxxxxxxxxxxxxxxxxxxxxxx ... xxxxxxxxxxxxxxxxxxx";
...

I tweaked the code to just compile the large example with big strings
and then used the Sample Process option of the OS X Activity Monitor to
give me a quick profile.

For a vanilla perl 5.10 (not debugging or threads):

~32% of the time was spent in S_scan_str and below, with:
    ~2% in Perl_sv_grow
~19% of the time was spent in S_sublex_start and below, with:
    ~10% in Perl_newSVpvn
     ~5% in Perl_sv_free
     ~4% in S_tokeq
~16% of the time was spent in Perl_yylex

Changing the code to use single quotes instead of double
made it run ~20% faster.

~59% of the time was spent in S_scan_str and below, with:
    ~5% in Perl_sv_grow
~39% of the time was spent in S_sublex_start and below, with:
    ~18% in Perl_newSVpvn
    ~11% in S_tokeq
     ~9% in Perl_sv_free
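
For anyone wanting to repeat the single versus double quote comparison, a compile-only benchmark could look roughly like the sketch below. It is not the tweaked script used for the profile above; build_source is a helper name made up for the example, and the sizes (200 literals of 5000 characters) are arbitrary values chosen to keep the run short.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Build the source of a sub that appends $count literals of $len "x"
# characters, each quoted with $quote (either " or ').
sub build_source {
    my ($quote, $len, $count) = @_;
    my $literal = $quote . ('x' x $len) . $quote;
    return "sub {\n my \$out = '';\n"
         . (" \$out .= $literal;\n" x $count)
         . " return \$out;\n}\n";
}

my %source = (
    double => build_source('"', 5000, 200),
    single => build_source("'", 5000, 200),
);

# Each timed sub only compiles the source; the generated sub is never called.
my %subs;
for my $name (keys %source) {
    my $src = $source{$name};
    $subs{$name} = sub { eval $src or die "compile failed: $@" };
}

timethese(20, \%subs);
```

With so few iterations Benchmark may warn that the count is too low to be reliable; raising the iteration count or the sizes gives steadier numbers at the cost of a longer run.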

Tim.


demerphq at gmail

Nov 4, 2009, 6:24 AM

Post #11 of 37 (668 views)
Permalink
Re: Eval is deadly slow [In reply to]

2009/11/4 Tim Bunce <Tim.Bunce [at] pobox>:
> Just so we're clear, we're talking about the time to compile code which
> contains very long literal strings:
>
>    $a .= "xxxxxxxxxxxxxxxxxxxxxxx ... xxxxxxxxxxxxxxxxxxx";
>    $a .= "xxxxxxxxxxxxxxxxxxxxxxx ... xxxxxxxxxxxxxxxxxxx";
>    $a .= "xxxxxxxxxxxxxxxxxxxxxxx ... xxxxxxxxxxxxxxxxxxx";
>    ...

Well this benchmark was.

I have had lots of experience with eval being really slow.

This was just the closest to hand benchmark.

> I tweaked the code to just compile the large example with big strings
> and then used the Sample Process option of the OS X Activity Monitor to
> give me a quick profile.
>
> For a vanilla perl 5.10 (not debugging or threads):
>
> ~32% of the time was spent in S_scan_str and below, with:
>    ~2% in Perl_sv_grow
> ~19% of the time was spent in S_sublex_start and below, with:
>    ~10% in Perl_newSVpvn
>     ~5% in Perl_sv_free
>     ~4% in S_tokeq
> ~16% of the time was spent in Perl_yylex
>
> Changing the code to use single quotes instead of double
> made it run ~20% faster.

Interesting, I didn't look at that because I was assuming the data
would contain escapes etc.

(Although I didn't want to time the actual escape process itself.)

But given that neither quoting form contains escapes, it suggests that
the dq case could be sped up quite a bit.

> ~59% of the time was spent in S_scan_str and below, with:
>    ~5% in Perl_sv_grow
> ~39% of the time was spent in S_sublex_start and below, with:
>    ~18% in Perl_newSVpvn
>    ~11% in S_tokeq
>     ~9% in Perl_sv_free

So if S_scan_str is made faster we might see some speedup. Cool.

Now, why do I have a feeling that S_scan_str is going to be really scary :-)

yves


--
perl -Mre=debug -e "/just|another|perl|hacker/"


h.m.brand at xs4all

Nov 4, 2009, 6:26 AM

Post #12 of 37 (668 views)
Permalink
Re: Eval is deadly slow [In reply to]

On Wed, 4 Nov 2009 14:08:56 +0000, Tim Bunce <Tim.Bunce [at] pobox>
wrote:

> Just so we're clear, we're talking about the time to compile code which
> contains very long literal strings:
>
> $a .= "xxxxxxxxxxxxxxxxxxxxxxx ... xxxxxxxxxxxxxxxxxxx";
> $a .= "xxxxxxxxxxxxxxxxxxxxxxx ... xxxxxxxxxxxxxxxxxxx";
> $a .= "xxxxxxxxxxxxxxxxxxxxxxx ... xxxxxxxxxxxxxxxxxxx";
> ...
>
> I tweaked the code to just compile the large example with big strings
> and then used the Sample Process option of the OS X Activity Monitor to
> give me a quick profile.
>
> For a vanilla perl 5.10 (not debugging or threads):
>
> ~32% of the time was spent in S_scan_str and below, with:
> ~2% in Perl_sv_grow
> ~19% of the time was spent in S_sublex_start and below, with:
> ~10% in Perl_newSVpvn
> ~5% in Perl_sv_free
> ~4% in S_tokeq
> ~16% of the time was spent in Perl_yylex
>
> Changing the code to use single quotes instead of double
> made it run ~20% faster.

That might very much depend on architecture and the way perl was
compiled. I recently benchmarked a piece of code regarding single vs
double quotes and the average diff was below 2%. That did NOT deal with
eval, so your findings might well be on-topic there.

> ~59% of the time was spent in S_scan_str and below, with:
> ~5% in Perl_sv_grow
> ~39% of the time was spent in S_sublex_start and below, with:
> ~18% in Perl_newSVpvn
> ~11% in S_tokeq
> ~9% in Perl_sv_free

--
H.Merijn Brand http://tux.nl Perl Monger http://amsterdam.pm.org/
using & porting perl 5.6.2, 5.8.x, 5.10.x, 5.11.x on HP-UX 10.20, 11.00,
11.11, 11.23, and 11.31, OpenSuSE 10.3, 11.0, and 11.1, AIX 5.2 and 5.3.
http://mirrors.develooper.com/hpux/ http://www.test-smoke.org/
http://qa.perl.org http://www.goldmark.org/jeff/stupid-disclaimers/


demerphq at gmail

Nov 4, 2009, 6:28 AM

Post #13 of 37 (655 views)
Permalink
Re: Eval is deadly slow [In reply to]

2009/11/4 H.Merijn Brand <h.m.brand [at] xs4all>:
>> Changing the code to use single quotes instead of double
>> made it run ~20% faster.
>
> That might very much depend on architecture and the way perl was
> compiled. I recently benchmarked a piece of code regarding single vs
> double quotes and the average diff was below 2%. That did NOT deal with
> eval, so your findings might well be on-topic there.

The length of the string is very important here as the graphs show. If
they are short i bet you see minimal difference.

cheers,
Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"


Tim.Bunce at pobox

Nov 4, 2009, 7:25 AM

Post #14 of 37 (666 views)
Permalink
Re: Eval is deadly slow [In reply to]

On Wed, Nov 04, 2009 at 03:26:22PM +0100, H.Merijn Brand wrote:
> >
> > Changing the code to use single quotes instead of double
> > made it run ~20% faster.
>
> That might very much depend on architecture and the way perl was
> compiled. I recently benchmarked a piece of code regarding single vs
> double quotes and the average diff was below 2%. That did NOT deal with
> eval, so your findings might well be on-topic there.

This particular benchmark uses many very long strings and almost nothing
else - hence the more significant impact of single vs double quoting.
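
A rough way to reproduce this kind of measurement, as a sketch only: the helper names make_source and time_compile are made up for illustration, and the lengths and statement count are arbitrary. The generated statements are wrapped in an anonymous sub so that the eval compiles them without running them.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Build $count statements, each appending a $len-character literal
# quoted with $quote (either ' or ").
sub make_source {
    my ($quote, $len, $count) = @_;
    my $literal = $quote . ("x" x $len) . $quote;
    return "\$s .= $literal;\n" x $count;
}

# Time how long perl takes to compile $source. Wrapping it in an
# anonymous sub means the eval only compiles the statements.
sub time_compile {
    my ($source) = @_;
    my $start   = time;
    my $code    = eval "sub { my \$s = ''; $source }";
    my $elapsed = time - $start;
    die "compile failed: $@" unless ref($code) eq 'CODE';
    return $elapsed;
}

for my $len (1_000, 10_000, 100_000) {
    my $double = time_compile(make_source('"', $len, 20));
    my $single = time_compile(make_source("'", $len, 20));
    printf "len=%-7d double=%.4fs single=%.4fs\n", $len, $double, $single;
}
```

Absolute numbers will vary with the perl build and the platform, so only the trend across lengths is worth reading.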

Tim.


Tom.Horsley at ccur

Nov 4, 2009, 10:45 AM

Post #15 of 37 (658 views)
Permalink
RE: Eval is deadly slow [In reply to]

> The length of the string is very important here as the graphs show. If
> they are short i bet you see minimal difference.

I know nothing about how perl works internally, but I've seen things like
this in many places besides perl where programs try to accumulate
string data and have to keep reallocing the accumulator to make it
bigger as the string continues to grow. Memory gets fragmented and
not reused because the string is getting bigger and none of the
old fragments are big enough, then when finally done, the program
starts over again and goes through the exact same thing on the
next string. Sometimes something simple like keeping track of
the biggest string seen, and preallocating that much space for
new strings can work wonders. Freeing extra space once at the
end is usually less overhead than constant reallocation during
the string creation.

Of course, this may be completely irrelevant, depending on how
perl actually parses strings today.


demerphq at gmail

Nov 4, 2009, 11:38 AM

Post #16 of 37 (651 views)
Permalink
Re: Eval is deadly slow [In reply to]

2009/11/4 Horsley, Tom <Tom.Horsley [at] ccur>:
>> The length of the string is very important here as the graphs show. If
>> they are short i bet you see minimal difference.
>
> I know nothing about how perl works internally, but I've seen things like
> this in many places besides perl where programs try to accumulate
> string data and have to keep reallocing the accumulator to make it
> bigger as the string continues to grow. Memory gets fragmented and
> not reused because the string is getting bigger and none of the
> old fragments are big enough, then when finally done, the program
> starts over again and goes through the exact same thing on the
> next string. Sometimes something simple like keeping track of
> the biggest string seen, and preallocating that much space for
> new strings can work wonders. Freeing extra space once at the
> end is usually less overhead than constant reallocation during
> the string creation.
>
> Of course, this may be completely irrelevant, depending on how
> perl actually parses strings today.

That looks to be the same process we are using. If my cursory analysis
is right we preallocate 80 bytes, then grow the string incrementally
after that.
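
To get a feel for why the growth policy matters, here is a toy model, not a description of what perl actually does internally. The 80-byte starting size comes from the analysis above; the fixed 80-byte increment and the doubling policy are both assumptions chosen for comparison, and count_reallocs is a made-up name.

```perl
use strict;
use warnings;

# Count how many times a buffer has to be enlarged before it can hold
# $target bytes. $grow is a callback mapping the current capacity to
# the next one.
sub count_reallocs {
    my ($target, $initial, $grow) = @_;
    my ($capacity, $reallocs) = ($initial, 0);
    while ($capacity < $target) {
        $capacity = $grow->($capacity);
        $reallocs++;
    }
    return $reallocs;
}

my $fixed    = sub { $_[0] + 80 };   # assumed fixed-size increment
my $doubling = sub { $_[0] * 2 };    # geometric growth, for comparison

for my $target (1_000, 10_000, 100_000) {
    printf "%7d bytes: fixed=%5d doubling=%2d\n", $target,
        count_reallocs($target, 80, $fixed),
        count_reallocs($target, 80, $doubling);
}
```

With a fixed increment the number of reallocations grows linearly with the string length, and each one may copy the whole buffer, which would fit the worse-than-linear timings if that is indeed what is going on.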

cheers,
Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"


davidnicol at gmail

Nov 4, 2009, 11:43 AM

Post #17 of 37 (650 views)
Permalink
Re: Eval is deadly slow [In reply to]

On Wed, Nov 4, 2009 at 1:38 PM, demerphq <demerphq [at] gmail> wrote:
> That looks to be the same process we are using. If my cursory analysis
> is right we preallocate 80 bytes, then grow the string incrementally
> after that.
>
> cheers,
> Yves

Making strings a more complex data structure than a length-associated
pointer to a buffer that is big enough would solve this and many other
issues; it would, however, be a very deep change.


rvtol+usenet at isolution

Nov 4, 2009, 11:54 AM

Post #18 of 37 (646 views)
Permalink
Re: Eval is deadly slow [In reply to]

demerphq wrote:

> the performance degrades substantially the more strings there are
> and the longer they are.

Maybe eval should be able to take on an array, or an arrayref.


This probably uses double the memory, might still be faster.

perl -we'
my @code;
push @code, "print 1", "print 2";
$" = ";";
eval "@code" or die;
'

--
Ruud


h.m.brand at xs4all

Nov 4, 2009, 11:58 AM

Post #19 of 37 (649 views)
Permalink
Re: Eval is deadly slow [In reply to]

On Wed, 4 Nov 2009 20:38:32 +0100, demerphq <demerphq [at] gmail> wrote:

> 2009/11/4 Horsley, Tom <Tom.Horsley [at] ccur>:
> >> The length of the string is very important here as the graphs show. If
> >> they are short i bet you see minimal difference.
> >
> > I know nothing about how perl works internally, but I've seen things like
> > this in many places besides perl where programs try to accumulate
> > string data and have to keep reallocing the accumulator to make it
> > bigger as the string continues to grow. Memory gets fragmented and
> > not reused because the string is getting bigger and none of the
> > old fragments are big enough, then when finally done, the program
> > starts over again and goes through the exact same thing on the
> > next string. Sometimes something simple like keeping track of
> > the biggest string seen, and preallocating that much space for
> > new strings can work wonders. Freeing extra space once at the
> > end is usually less overhead than constant reallocation during
> > the string creation.
> >
> > Of course, this may be completely irrelevant, depending on how
> > perl actually parses strings today.
>
> That looks to be the same process we are using. If my cursory analysis
> is right we preallocate 80 bytes, then grow the string incrementally
> after that.

IIRC I once suggested to allow length () to do preallocation:

$ perl -wle'length($a) = 8000'
Can't modify length in scalar assignment at -e line 1, at EOF
Execution of -e aborted due to compilation errors.

--
H.Merijn Brand http://tux.nl Perl Monger http://amsterdam.pm.org/
using & porting perl 5.6.2, 5.8.x, 5.10.x, 5.11.x on HP-UX 10.20, 11.00,
11.11, 11.23, and 11.31, OpenSuSE 10.3, 11.0, and 11.1, AIX 5.2 and 5.3.
http://mirrors.develooper.com/hpux/ http://www.test-smoke.org/
http://qa.perl.org http://www.goldmark.org/jeff/stupid-disclaimers/


davidnicol at gmail

Nov 4, 2009, 12:10 PM

Post #20 of 37 (664 views)
Permalink
Re: Eval is deadly slow [In reply to]

On Wed, Nov 4, 2009 at 1:58 PM, H.Merijn Brand <h.m.brand [at] xs4all> wrote:

> IIRC I once suggested to allow length () to do preallocation:
>
> $ perl -wle'length($a) = 8000'
> Can't modify length in scalar assignment at -e line 1, at EOF
> Execution of -e aborted due to compilation errors.

I like it! Since both keys(%h) and $#arr can be L-values, I would even
go so far as to call the failure of length($s) to follow suit a bug.
Would it truncate, though? Would a new keyword that allows perl-side
access into the SV implementation make more sense? A preallocate function
would be a completely trivial wrapper around SvGROW.
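
For reference, a small sketch of how the two existing l-values behave (the helper names resize_array and presize_hash are made up for illustration). Note that they differ on the truncation question: assigning to $#arr really changes the array's length, while assigning to keys(%h) only requests buckets.

```perl
use strict;
use warnings;

# Assigning to $#array changes the array's length: it extends with
# undef elements or truncates, depending on the new last index.
sub resize_array {
    my ($array, $count) = @_;
    $#{$array} = $count - 1;
    return scalar @{$array};
}

# Assigning to keys(%hash) only requests buckets; it neither adds
# nor removes keys.
sub presize_hash {
    my ($hash, $buckets) = @_;
    keys(%{$hash}) = $buckets;
    return scalar keys %{$hash};
}

my @arr = (1 .. 5);
print resize_array(\@arr, 1000), "\n";   # the array grows
print resize_array(\@arr, 3), "\n";      # the array is truncated
my %h = (a => 1);
print presize_hash(\%h, 1000), "\n";     # the key count is unchanged
```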



--
"If you can't find someone on the ballot you believe in, I encourage
you to run and seek office." -- Dean Greco, All-Day Breakfast Party


ben at morrow

Nov 4, 2009, 12:59 PM

Post #21 of 37 (651 views)
Permalink
Re: Eval is deadly slow [In reply to]

Quoth h.m.brand [at] xs4all ("H.Merijn Brand"):
>
> IIRC I once suggested to allow length () to do preallocation:
>
> $ perl -wle'length($a) = 8000'
> Can't modify length in scalar assignment at -e line 1, at EOF
> Execution of -e aborted due to compilation errors.

    ~% perl -MDevel::Peek -e'my $x = ("x" x 10_000); $x = ""; Dump $x'
    SV = PV(0x810109c) at 0x8103b10
      REFCNT = 1
      FLAGS = (PADMY,POK,pPOK)
      PV = 0x8134004 ""\0
      CUR = 0
      LEN = 16380
    ~%

so while lvalue length would be a little neater, it's not strictly
necessary.
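
The same CUR and LEN fields can be read from inside a script with the core B module, which may be handier than eyeballing Dump output. This is only a sketch: cur_and_len is a made-up helper name, and whether the buffer is kept after the string is emptied may depend on the perl version (newer perls use copy-on-write for some strings), so no particular LEN is promised here.

```perl
use strict;
use warnings;
use B ();

# Return (CUR, LEN) for the string scalar behind $ref: the current
# string length and the size of the allocated buffer, i.e. the same
# fields that Devel::Peek's Dump shows.
sub cur_and_len {
    my ($ref) = @_;
    my $sv = B::svref_2object($ref);
    return ($sv->CUR, $sv->LEN);
}

my $x = "";
$x .= "x" x 10_000;                 # fill the scalar's own buffer
printf "filled:  CUR=%d LEN=%d\n", cur_and_len(\$x);
$x = "";                            # empty it and see what LEN is now
printf "emptied: CUR=%d LEN=%d\n", cur_and_len(\$x);
```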

Ben


demerphq at gmail

Nov 4, 2009, 1:20 PM

Post #22 of 37 (660 views)
Permalink
Re: Eval is deadly slow [In reply to]

2009/11/4 Ben Morrow <ben [at] morrow>:
> Quoth h.m.brand [at] xs4all ("H.Merijn Brand"):
>>
>> IIRC I once suggested to allow length () to do preallocation:
>>
>> $ perl -wle'length($a) = 8000'
>> Can't modify length in scalar assignment at -e line 1, at EOF
>> Execution of -e aborted due to compilation errors.
>
>    ~% perl -MDevel::Peek -e'my $x = ("x" x 10_000); $x = ""; Dump $x'
>    SV = PV(0x810109c) at 0x8103b10
>      REFCNT = 1
>      FLAGS = (PADMY,POK,pPOK)
>      PV = 0x8134004 ""\0
>      CUR = 0
>      LEN = 16380
>    ~%
>
> so while lvalue length would be a little neater, it's not strictly
> necessary.

Also notice the preallocated space. It was asked for 10k; it gave 16380.

cheers,
Yves



--
perl -Mre=debug -e "/just|another|perl|hacker/"


davidnicol at gmail

Nov 4, 2009, 1:52 PM

Post #23 of 37 (648 views)
Permalink
Re: Eval is deadly slow [In reply to]

On Wed, Nov 4, 2009 at 2:59 PM, Ben Morrow <ben [at] morrow> wrote:
>    ~% perl -MDevel::Peek -e'my $x = ("x" x 10_000); $x = ""; Dump $x'
>    SV = PV(0x810109c) at 0x8103b10
>      REFCNT = 1
>      FLAGS = (PADMY,POK,pPOK)
>      PV = 0x8134004 ""\0
>      CUR = 0
>      LEN = 16380
>    ~%
>
> so while lvalue length would be a little neater, it's not strictly
> necessary.
>
> Ben

Great! So l-value length is yet another suitable case for the Macro
treatment, when available.




h.m.brand at xs4all

Nov 4, 2009, 11:48 PM

Post #24 of 37 (644 views)
Permalink
Re: Eval is deadly slow [In reply to]

On Wed, 4 Nov 2009 20:59:50 +0000, Ben Morrow <ben [at] morrow> wrote:

> Quoth h.m.brand [at] xs4all ("H.Merijn Brand"):
> >
> > IIRC I once suggested to allow length () to do preallocation:
> >
> > $ perl -wle'length($a) = 8000'
> > Can't modify length in scalar assignment at -e line 1, at EOF
> > Execution of -e aborted due to compilation errors.
>
> ~% perl -MDevel::Peek -e'my $x = ("x" x 10_000); $x = ""; Dump $x'
> SV = PV(0x810109c) at 0x8103b10
> REFCNT = 1
> FLAGS = (PADMY,POK,pPOK)
> PV = 0x8134004 ""\0
> CUR = 0
> LEN = 16380
> ~%
>
> so while lvalue length would be a little neater, it's not strictly
> necessary.

There is quite a big difference here. 'x' x 10_000 actually allocates a
temporary space for the value to be used to initialize $x with, then
copies the variable and resets the length afterwards. A lot of unneeded
ops.

$ perl -MO=Deparse -we'my $x = ("x" x 10_000); $x = ""'
BEGIN { $^W = 1; }
my $x = 'x' x 10000;
$x = '';
-e syntax OK
$ perl -MO=Concise -we'my $x = ("x" x 10_000); $x = ""'
c <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -e:1) v:{ ->3
7 <2> sassign vKS/2 ->8
5 <2> repeat[t2] sKP/2 ->6
3 <$> const(PV "x") s ->4
4 <$> const(IV 10000) s ->5
6 <0> padsv[$x:1,2] sRM*/LVINTRO ->7
8 <;> nextstate(main 2 -e:1) v:{ ->9
b <2> sassign vKS/2 ->c
9 <$> const(PV "") s ->a
a <0> padsv[$x:1,2] sRM* ->b
-e syntax OK

If possible, I would suggest to allow 'length ($x) = 10000' to be
something ending in

SvGROW ($x, 10000);

one single op!



hv at crypt

Nov 5, 2009, 5:13 AM

Post #25 of 37 (640 views)
Permalink
Re: Eval is deadly slow [In reply to]

demerphq <demerphq [at] gmail> wrote:
:2009/11/4 Ben Morrow <ben [at] morrow>:
:> Quoth h.m.brand [at] xs4all ("H.Merijn Brand"):
:>>
:>> IIRC I once suggested to allow length () to do preallocation:
:>>
:>> $ perl -wle'length($a) = 8000'
:>> Can't modify length in scalar assignment at -e line 1, at EOF
:>> Execution of -e aborted due to compilation errors.
:>
:>    ~% perl -MDevel::Peek -e'my $x = ("x" x 10_000); $x = ""; Dump $x'
:>    SV = PV(0x810109c) at 0x8103b10
:>      REFCNT = 1
:>      FLAGS = (PADMY,POK,pPOK)
:>      PV = 0x8134004 ""\0
:>      CUR = 0
:>      LEN = 16380
:>    ~%
:>
:> so while lvalue length would be a little neater, it's not strictly
:> necessary.
:
:Also notice the preallocated space. It was asked for 10k; it gave 16380.

An alternative approach:
    ~% perl -MDevel::Peek -e 'my $x=""; vec($x, 10000, 8)=0; $x=""; Dump $x'
    SV = PV(0x8b13b00) at 0x8b12cdc
      REFCNT = 2
      FLAGS = (PADMY,POK,pPOK)
      PV = 0x8b53c20 ""\0
      CUR = 0
      LEN = 10004
    ~%

Hmm, not sure why I get refcount 2. But I think this should be a more
efficient way to allocate the space than ("x" x 10_000).
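
Wrapped up as a helper, the vec trick might look like the sketch below. The name preallocate is hypothetical (it is not an existing builtin or module function), and it relies on perl keeping the buffer around after the string is emptied, as in the Dump above, which is an observation about this perl rather than a documented guarantee.

```perl
use strict;
use warnings;

# Hypothetical helper: grow the scalar behind $ref to $bytes bytes with
# vec, then empty it again, in the hope that perl keeps the buffer
# allocated for later appends. Any previous value is discarded.
sub preallocate {
    my ($ref, $bytes) = @_;
    $$ref = "";
    vec($$ref, $bytes - 1, 8) = 0 if $bytes > 0;   # extend to $bytes bytes
    $$ref = "";                                    # back to an empty string
    return;
}

my $buf;
preallocate(\$buf, 10_000);
$buf .= "chunk $_\n" for 1 .. 3;   # appends as usual from here on
print $buf;
```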

Hugo
