
Mailing List Archive: Perl: porters

Speeding up mktables; NYTprof

 

 



public at khwilliamson

Nov 25, 2009, 9:33 PM

Post #1 of 10 (486 views)
Speeding up mktables; NYTprof

Nicholas Clark wrote:
>> [...]
>
> Also, you mentioned Encode and the files it generated. I found a lot of scope
> for optimisation within enc2xs, which dramatically decreased its memory use
> and run time, without needing large fundamental design changes to how it
> worked. And I managed that with only Devel::DProf. I'm curious what can be
> achieved with Devel::NYTProf on mktables. (Which is a task that is within
> the skill set of any of the 600 subscribers to this list. I'm hoping that it
> might appeal to at least one.)

So, I tried it with NYTProf. As I expected (I had used DProf earlier),
the highest usage subroutine was my pure Perl version of
Scalar::Util::refaddr, reproduced below. A third of the total time was
spent in this routine. (This is required because miniperl doesn't do
dynamic loading, so refaddr is not available.)

When I was writing mktables, I was under the impression that refaddr
would be brought into the core for 5.12. There was an agreement to that
effect, but I guess no one ever got around to actually doing it.

When I run this under perl instead of miniperl, and change objaddr to
just return refaddr, the combination still takes quite a lot of time.
If refaddr were in the core would it be in-lined?

I found a few surprises; I haven't pored over the results, though. One
is that I left in a trace statement that got as far as the trace subroutine
before discovering that it had nothing to do. This did not add much
time.

Based on looking at existing code in utf8_heavy.pl, I had presumed that
the Perl optimizer would remove code that depended on a constant
subroutine that returns false. That is, 'foo if DEBUG' would be
optimized away if there was a line: 'sub Debug { 0 }' But that appears
to not be the case.

There were more string evals than I expected, though the total time did
not add up to all that much. I couldn't find a way in NYTProf to
highlight those.

There are two columns in the nytprofhtml output for subroutines that I
can't figure out what they mean, and saw no documentation for, 'P' and 'F'.

I also did not see anything there for memory usage. I don't know how
Perl handles using up too much memory. The old mktables kept all its
tables in memory, and so I felt free to do so as well. But the new
mktables handles quite a few more tables than the old one. I would
think you would get thrashing if the memory usage got too big. I wonder
if Steve's machine doesn't have much memory. When I run mktables using
perl instead of miniperl, I get execution times between 30 and 40 seconds.


public at khwilliamson

Nov 25, 2009, 9:42 PM

Post #2 of 10 (468 views)
Re: Speeding up mktables; NYTprof [In reply to]

Oops, forgot to put the function in.
karl williamson wrote:
> Nicholas Clark wrote:
>>> [...]
>>
>> Also, you mentioned Encode and the files it generated. I found a lot
>> of scope
>> for optimisation within enc2xs, which dramatically decreased its
>> memory use
>> and run time, without needing large fundamental design changes to how it
>> worked. And I managed that with only Devel::DProf. I'm curious what
>> can be
>> achieved with Devel::NYTProf on mktables. (Which is a task that is within
>> the skill set of any of the 600 subscribers to this list. I'm hoping
>> that it
>> might appeal to at least one.)
>
> So, I tried it with NYTProf. As I expected (I had used DProf earlier),
> the highest usage subroutine was my pure Perl version of
> Scalar::Util::refaddr, reproduced below. A third of the total time was
> spent in this routine. (This is required because miniperl doesn't do
> dynamic loading, so refaddr is not available.)
>
> When I was writing mktables, I was under the impression that refaddr
> would be brought into the core for 5.12. There was an agreement to that
> effect, but I guess no one ever got around to actually doing it.
>
> When I run this under perl instead of miniperl, and change objaddr to
> just return refaddr, the combination still takes quite a lot of time. If
> refaddr were in the core would it be in-lined?
>
> I found a few surprises; I haven't pored over the results, though. One
> is that I left in a trace statement that got to the trace subroutine
> before discovering that it had nothing to do. This added not very much
> time.
>
> Based on looking at existing code in utf8_heavy.pl, I had presumed that
> the Perl optimizer would remove code that depended on a constant
> subroutine that returns false. That is, 'foo if DEBUG' would be
> optimized away if there was a line: 'sub Debug { 0 }' But that appears
> to not be the case.
>
> There were more string evals than I expected, though the total time did
> not add up to all that much. I couldn't find a way in NYTProf to
> highlight those.
>
> There are two columns in the nytprofhtml output for subroutines that I
> can't figure out what they mean, and saw no documentation for, 'P' and 'F'.
>
> I also did not see anything there for memory usage. I don't know how
> Perl handles using up too much memory. The old mktables kept all its
> tables in memory, and so I felt free to do so as well. But the new
> mktables handles quite a few more tables than the old one. I would
> think you would get thrashing if the memory usage got too big. I wonder
> if Steve's machine doesn't have much memory. When I run mktables using
> perl instead of miniperl, I get execution times between 30 and 40 seconds.
>
sub objaddr($) {
    # Returns the address of the blessed input object. Uses the XS version if
    # available. It doesn't check for blessedness because that would do a
    # string eval every call, and the program is structured so that this is
    # never called for a non-blessed object.

    return Scalar::Util::refaddr($_[0]) if $has_fast_scalar_util;

    # Get the package
    my $pkg = ref($_[0]) or return undef;

    # Change to a fake package to defeat any overloading
    bless $_[0], 'main::Fake';

    # Numifying a ref gives its address.
    my $addr = 0 + $_[0];

    # Return to original class
    bless $_[0], $pkg;
    return $addr;
}

I found that any overload in a class caused the numifying to fail if I
did it in that class; hence the blesses are necessary.
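A minimal sketch of the effect described above, assuming a hypothetical class Counter (not part of mktables) that overloads only "+". Inside that class, 0 + $obj runs the overload instead of yielding the address; temporarily reblessing into a package with no overloading exposes the address again.

```perl
use strict;
use warnings;
use Scalar::Util ();

# Hypothetical class for illustration only: it overloads "+" and nothing else.
package Counter;
use overload '+' => sub { 42 };
sub new { return bless {}, shift }

package main;

my $obj = Counter->new;

# While blessed into Counter, addition dispatches to the overload.
my $overloaded = 0 + $obj;

# Rebless into a plain package, numify, then restore the original class.
my $pkg = ref $obj;
bless $obj, 'main::Fake';
my $addr = 0 + $obj;
bless $obj, $pkg;

print "overloaded result: $overloaded\n";
print "reblessed numification matches refaddr\n"
    if $addr == Scalar::Util::refaddr($obj);
```

The two bless calls on every lookup are the likely source of the cost the profiler reported for this routine.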


pagaltzis at gmx

Nov 25, 2009, 9:57 PM

Post #3 of 10 (464 views)
Re: Speeding up mktables; NYTprof [In reply to]

* karl williamson <public [at] khwilliamson> [2009-11-26 06:35]:
> As I expected (I had used DProf earlier), the highest usage
> subroutine was my pure Perl version of Scalar::Util::refaddr,
> reproduced below. A third of the total time was spent in this
> routine. (This is required because miniperl doesn't do dynamic
> loading, so refaddr is not available.)

There was consensus for putting the Scalar::Utils stuff in core
a while ago… I wonder if that’s still on anyone’s radar.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>


Eirik-Berg.Hanssen at allverden

Nov 25, 2009, 11:27 PM

Post #4 of 10 (465 views)
Re: Speeding up mktables; NYTprof [In reply to]

karl williamson <public [at] khwilliamson> writes:

> Based on looking at existing code in utf8_heavy.pl, I had presumed
> that the Perl optimizer would remove code that depended on a constant
> subroutine that returns false. That is, 'foo if DEBUG' would be
> optimized away if there was a line: 'sub Debug { 0 }' But that
> appears to not be the case.

It needs a prototype: 'sub Debug () { 0 }'.

From perlsub:

# Functions with a prototype of "()" are potential candidates for
# inlining. If the result after optimization and constant folding is
# either a constant or a lexically‐scoped scalar which has no other
# references, then it will be used in place of function calls made
# without "&". Calls made using "&" are never inlined. (See
# constant.pm for an easy way to declare most constants.)


Eirik
--
All bridge hands are equally likely, but some are more equally likely
than others.
-- Alan Truscott


nick at ccl4

Nov 26, 2009, 7:43 AM

Post #5 of 10 (453 views)
Re: Speeding up mktables; NYTprof [In reply to]

On Wed, Nov 25, 2009 at 10:42:59PM -0700, karl williamson wrote:
> Oops, forgot to put the function in.
> karl williamson wrote:

> >So, I tried it with NYTProf. As I expected (I had used DProf earlier),
> >the highest usage subroutine was my pure Perl version of
> >Scalar::Util::refaddr, reproduced below. A third of the total time was
> >spent in this routine. (This is required because miniperl doesn't do
> >dynamic loading, so refaddr is not available.)
> >
> >When I was writing mktables, I was under the impression that refaddr
> >would be brought into the core for 5.12. There was an agreement to that
> >effect, but I guess no one ever got around to actually doing it.
> >
> >When I run this under perl instead of miniperl, and change objaddr to
> >just return refaddr, the combination still takes quite a lot of time. If
> >refaddr were in the core would it be in-lined?

> sub objaddr($) {
> # Returns the address of the blessed input object. Uses the XS version if
> # available. It doesn't check for blessedness because that would do a
> # string eval every call, and the program is structured so that this is
> # never called for a non-blessed object.
>
> return Scalar::Util::refaddr($_[0]) if $has_fast_scalar_util;
>
> # Get the package
> my $pkg = ref($_[0]) or return undef;
>
> # Change to a fake package to defeat any overloading
> bless $_[0], 'main::Fake';
>
> # Numifying a ref gives its address.
> my $addr = 0 + $_[0];
>
> # Return to original class
> bless $_[0], $pkg;
> return $addr;
> }
>
> I found that any overload in a class caused the numifying to fail if I
> did it in that class; hence the blesses are necessary.

I remember comments being made on IRC about the 'no overloading' pragma. Using it
avoids the need to rebless. With this patch:

diff --git a/lib/unicore/mktables b/lib/unicore/mktables
index ee51608..b2624f0 100644
--- a/lib/unicore/mktables
+++ b/lib/unicore/mktables
@@ -1133,17 +1133,11 @@ sub objaddr($) {
return Scalar::Util::refaddr($_[0]) if $has_fast_scalar_util;

# Check at least that is a ref.
- my $pkg = ref($_[0]) or return undef;
-
- # Change to a fake package to defeat any overloaded stringify
- bless $_[0], 'main::Fake';
+ ref($_[0]) or return undef;

# Numifying a ref gives its address.
- my $addr = 0 + $_[0];
-
- # Return to original class
- bless $_[0], $pkg;
- return $addr;
+ no overloading;
+ return 0 + $_[0];
}

sub max ($$) {

I find that on "my machine" the run time (with ./miniperl) goes down from 47 to
38 seconds. That's pretty close to the run time with ./perl (and hence
real Scalar::Util::refaddr) of 35.5 seconds.
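For readers who want to try the pragma outside mktables, here is a small self-contained sketch of the patched routine. The Counter class is a hypothetical stand-in for any class with overloading, and the 'no overloading' pragma needs perl 5.10.1 or later.

```perl
use strict;
use warnings;
use Scalar::Util ();

# Hypothetical class for illustration only.
package Counter;
use overload '+' => sub { 42 };
sub new { return bless {}, shift }

package main;

sub objaddr {
    # Check at least that it is a ref.
    ref($_[0]) or return undef;

    # Overloading is switched off for the rest of this block,
    # so the addition numifies the reference itself.
    no overloading;
    return 0 + $_[0];
}

my $obj = Counter->new;
print "objaddr matches refaddr\n"
    if objaddr($obj) == Scalar::Util::refaddr($obj);
```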

Technically, I think, this falls foul of a strict reading of the feature
freeze, as it's not a bug fix. Unless "it's too slow" is considered a bug.

Nicholas Clark


public at khwilliamson

Nov 26, 2009, 7:46 AM

Post #6 of 10 (458 views)
Re: Speeding up mktables; NYTprof [In reply to]

Eirik Berg Hanssen wrote:
> karl williamson <public [at] khwilliamson> writes:
>
>> Based on looking at existing code in utf8_heavy.pl, I had presumed
>> that the Perl optimizer would remove code that depended on a constant
>> subroutine that returns false. That is, 'foo if DEBUG' would be
>> optimized away if there was a line: 'sub Debug { 0 }' But that
>> appears to not be the case.
>
> It needs a prototype: 'sub Debug () { 0 }'.
>
> From perlsub:
>
> # Functions with a prototype of "()" are potential candidates for
> # inlining. If the result after optimization and constant folding is
> # either a constant or a lexically‐scoped scalar which has no other
> # references, then it will be used in place of function calls made
> # without "&". Calls made using "&" are never inlined. (See
> # constant.pm for an easy way to declare most constants.)
>
>
> Eirik
I looked and actually it is that:
sub DEBUG () { 0 } # Set to 0 for production; 1 for development

so it isn't getting optimized out, nonetheless.


Tim.Bunce at pobox

Nov 26, 2009, 9:30 AM

Post #7 of 10 (455 views)
Re: Speeding up mktables; NYTprof [In reply to]

On Wed, Nov 25, 2009 at 10:33:29PM -0700, karl williamson wrote:
>
> There are two columns in the nytprofhtml output for subroutines that I
> can't figure out what they mean, and saw no documentation for, 'P' and 'F'.

They're the number of 'places' the sub was called from (distinct file +
line number) and the number of distinct files. Hovering over the P and
the F should show a tooltip saying something along those lines.

Tim.


public at khwilliamson

Nov 26, 2009, 9:36 AM

Post #8 of 10 (451 views)
Re: Speeding up mktables; NYTprof [In reply to]

Nicholas Clark wrote:
> On Wed, Nov 25, 2009 at 10:42:59PM -0700, karl williamson wrote:
>> Oops, forgot to put the function in.
>> karl williamson wrote:
>
>>> So, I tried it with NYTProf. As I expected (I had used DProf earlier),
>>> the highest usage subroutine was my pure Perl version of
>>> Scalar::Util::refaddr, reproduced below. A third of the total time was
>>> spent in this routine. (This is required because miniperl doesn't do
>>> dynamic loading, so refaddr is not available.)
>>>
>>> When I was writing mktables, I was under the impression that refaddr
>>> would be brought into the core for 5.12. There was an agreement to that
>>> effect, but I guess no one ever got around to actually doing it.
>>>
>>> When I run this under perl instead of miniperl, and change objaddr to
>>> just return refaddr, the combination still takes quite a lot of time. If
>>> refaddr were in the core would it be in-lined?
>
>> sub objaddr($) {
>> # Returns the address of the blessed input object. Uses the XS version if
>> # available. It doesn't check for blessedness because that would do a
>> # string eval every call, and the program is structured so that this is
>> # never called for a non-blessed object.
>>
>> return Scalar::Util::refaddr($_[0]) if $has_fast_scalar_util;
>>
>> # Get the package
>> my $pkg = ref($_[0]) or return undef;
>>
>> # Change to a fake package to defeat any overloading
>> bless $_[0], 'main::Fake';
>>
>> # Numifying a ref gives its address.
>> my $addr = 0 + $_[0];
>>
>> # Return to original class
>> bless $_[0], $pkg;
>> return $addr;
>> }
>>
>> I found that any overload in a class caused the numifying to fail if I
>> did it in that class; hence the blesses are necessary.
>
> I have memory of comments being made on IRC about no overloading. Using it
> avoids the need to rebless. With this patch:
>
> diff --git a/lib/unicore/mktables b/lib/unicore/mktables
> index ee51608..b2624f0 100644
> --- a/lib/unicore/mktables
> +++ b/lib/unicore/mktables
> @@ -1133,17 +1133,11 @@ sub objaddr($) {
> return Scalar::Util::refaddr($_[0]) if $has_fast_scalar_util;
>
> # Check at least that is a ref.
> - my $pkg = ref($_[0]) or return undef;
> -
> - # Change to a fake package to defeat any overloaded stringify
> - bless $_[0], 'main::Fake';
> + ref($_[0]) or return undef;
>
> # Numifying a ref gives its address.
> - my $addr = 0 + $_[0];
> -
> - # Return to original class
> - bless $_[0], $pkg;
> - return $addr;
> + no overloading;
> + return 0 + $_[0];
> }
>
> sub max ($$) {
>
> I find that on "my machine" the run time (with ./miniperl) goes down from 47 to
> 38 seconds. That's pretty close to the run time with ./perl (and hence
> real Scalar::Util::refaddr) of 35.5 seconds.

Many thanks
>
> Technically, I think, this falls foul of a strict reading of the feature
> freeze, as it's not a bug fix. Unless "it's too slow" is considered a bug.
>

Ah, but it isn't a new feature; I could argue that mktables isn't even a
new feature, but no point; it met the deadline anyway. I'll submit a
patch with your fix. I'm trying to keep mktables compatible with 5.8 in
case a user wants to apply it; no object recompilations are needed. 'no
overloading' is not a 5.8 feature, so I'll add something to redefine it
when that pragma is not available.
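One possible shape for such a fallback, as a sketch only; the thread does not show how the actual patch does it. Because 'no overloading' is a compile-time pragma, each variant of the routine is compiled from a string, so a perl without overloading.pm never sees the pragma. The variable names and Some::Class are illustrative assumptions.

```perl
use strict;
use warnings;

# True when the 'overloading' pragma can be loaded (perl 5.10.1 and later).
my $has_no_overloading = eval { require overloading; 1 } ? 1 : 0;

# Pick the source text for objaddr; only the chosen variant is compiled.
my $source = $has_no_overloading
    ? q{
        sub objaddr {
            ref($_[0]) or return undef;
            no overloading;
            return 0 + $_[0];
        }
        1;
      }
    : q{
        sub objaddr {
            my $pkg = ref($_[0]) or return undef;
            bless $_[0], 'main::Fake';    # defeat any overloading
            my $addr = 0 + $_[0];
            bless $_[0], $pkg;            # return to original class
            return $addr;
        }
        1;
      };
eval $source or die $@;

my $obj = bless {}, 'Some::Class';    # hypothetical class name
printf "address: 0x%x\n", objaddr($obj);
```

The string eval runs once at start-up, so it adds no per-call cost.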

> Nicholas Clark
>


ikegami at adaelis

Nov 26, 2009, 10:18 AM

Post #9 of 10 (455 views)
Re: Speeding up mktables; NYTprof [In reply to]

On Thu, Nov 26, 2009 at 10:46 AM, karl williamson
<public [at] khwilliamson> wrote:

> I looked and actually it is that:
> sub DEBUG () { 0 } # Set to 0 for production; 1 for development
>
> so it isn't getting optimized out, nonetheless.
>

$ perl -Ilib -MO=Concise,-exec -e'sub DEBUG () { 1 } foo() if DEBUG'
1 <0> enter
2 <;> nextstate(main 2 -e:1) v:{
3 <0> pushmark s
4 <#> gv[*foo] s/EARLYCV
5 <1> entersub[t2] vKS/TARG,1
6 <@> leave[1 ref] vKP/REFC
-e syntax OK

$ perl -Ilib -MO=Concise,-exec -e'sub DEBUG () { 0 } foo() if DEBUG'
1 <0> enter
2 <;> nextstate(main 2 -e:1) v:{
3 <@> leave[1 ref] vKP/REFC
-e syntax OK

In both cases, the "if" is optimised away. And in the case where DEBUG
returns 0, so is the function call.
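The same check can be scripted with B::Deparse rather than read off the op tree. This is a rough sketch: trace() and NOT_CONST are hypothetical names, and trace() is only compiled here, never called. It contrasts a sub with an empty prototype against one without.

```perl
use strict;
use warnings;
use B::Deparse;

sub DEBUG ()  { 0 }    # empty prototype: candidate for inlining
sub NOT_CONST { 0 }    # no prototype: an ordinary call at run time

# Two otherwise identical code refs guarded by the two subs above.
my $folded = sub { trace('x') if DEBUG;     return 1 };
my $called = sub { trace('x') if NOT_CONST; return 1 };

# Turn the compiled code back into source text and look for the call.
my $deparse     = B::Deparse->new;
my $folded_text = $deparse->coderef2text($folded);
my $called_text = $deparse->coderef2text($called);

print "guarded by DEBUG:     ",
    ($folded_text =~ /trace/ ? "call kept" : "call removed"), "\n";
print "guarded by NOT_CONST: ",
    ($called_text =~ /trace/ ? "call kept" : "call removed"), "\n";
```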


jesse at fsck

Nov 29, 2009, 1:23 PM

Post #10 of 10 (414 views)
Re: Speeding up mktables; NYTprof [In reply to]

On Thu, Nov 26, 2009 at 06:57:51AM +0100, Aristotle Pagaltzis wrote:
> * karl williamson <public [at] khwilliamson> [2009-11-26 06:35]:
> > As I expected (I had used DProf earlier), the highest usage
> > subroutine was my pure Perl version of Scalar::Util::refaddr,
> > reproduced below. A third of the total time was spent in this
> > routine. (This is required because miniperl doesn't do dynamic
> > loading, so refaddr is not available.)
>
> There was consensus for putting the Scalar::Utils stuff in core
> a while ago… I wonder if that’s still on anyone’s radar.

There was, iirc, some debate over "the right way" to do it - and I never
saw a patch.
