Mailing List Archive: Perl: porters

Building a shared libperl should be the default for 5.12

 

 



Tim.Bunce at pobox

Nov 4, 2009, 5:01 AM

Post #1 of 19
Building a shared libperl should be the default for 5.12

INSTALL says:

=head3 Building a shared Perl library

Currently, for most systems, the main perl executable is built by
linking the "perl library" libperl.a with perlmain.o, your static
extensions, and various extra libraries, such as -lm.

On systems that support dynamic loading, it may be possible to
replace libperl.a with a shared libperl.so. If you anticipate building
several different perl binaries (e.g. by embedding libperl into
different programs, or by using the optional compiler extension), then
you might wish to build a shared libperl.so so that all your binaries
can share the same library.

The disadvantages are that there may be a significant performance
penalty associated with the shared libperl.so, and that the overall
mechanism is still rather fragile with respect to different versions
and upgrades.

In terms of performance, on my test system (Solaris 2.5_x86) the perl
test suite took roughly 15% longer to run with the shared libperl.so.
[...]

You can elect to build a shared libperl by

sh Configure -Duseshrplib


I think it's strategically important for there to be a libperl shared
library available on all systems that support it.

One current example use-case is plperl embedded in PostgreSQL. That
doesn't get enabled when postgres is built unless there's a shared
libperl available.
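
Whether a given perl was built with a shared libperl can be read from its configuration. The sketch below is only illustrative: it assumes the usual `name='value';` output of `perl -V:name`, and the helper names (`parse_perl_config`, `query_perl_config`, `has_shared_libperl`) are made up for this example rather than part of any existing tool.

```python
import re
import subprocess

# One line of `perl -V:name` output looks like: useshrplib='true';
_CONFIG_LINE = re.compile(r"^(\w+)='(.*)';\s*$")


def parse_perl_config(text):
    """Parse `perl -V:name` style output into a dict of name -> value."""
    values = {}
    for line in text.splitlines():
        match = _CONFIG_LINE.match(line.strip())
        if match:
            values[match.group(1)] = match.group(2)
    return values


def query_perl_config(names, perl="perl"):
    """Ask a perl binary for the given Config keys (needs perl on PATH)."""
    values = {}
    for name in names:
        result = subprocess.run(
            [perl, "-V:" + name], capture_output=True, text=True, check=True
        )
        values.update(parse_perl_config(result.stdout))
    return values


def has_shared_libperl(config):
    """True when the perl was configured with -Duseshrplib."""
    return config.get("useshrplib") == "true"


# Parsing a captured sample, so the example runs without perl installed.
sample = "useshrplib='false';\nlibperl='libperl.a';\n"
print(has_shared_libperl(parse_perl_config(sample)))
```

On a machine with perl installed, `query_perl_config(["useshrplib", "libperl"])` would return the same kind of dictionary for the perl found on PATH.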

One future use-case is that perl5 support in perl6, especially for
compiled extensions, may involve embedding the perl5 libperl into
whatever's executing perl6, e.g., parrot.

Building a shared libperl isn't the default on all platforms that
support it, I suspect because of the performance issue noted in INSTALL.

For example:
hints/darwin.sh:# useshrplib=true results in much slower startup times.
hints/darwin.sh:# 'false' is the default value. Configure -Duseshrplib to override.

It seems to me that -Duseshrplib conflates two things that should be
separate: a) the building and installing of a shared libperl, and
b) linking the perl executable against it instead of the libperl.a.

I'd like to see a separate option (e.g. -Dmakeshrplib) to enable building
and installing a shared libperl. -Duseshrplib would then also imply
-Dmakeshrplib.

The -Dmakeshrplib option should be enabled by default where possible,
so the availability of a shared libperl becomes much more common.

There would be no change to how the perl executable itself is linked.

Tim.


nick at ccl4

Nov 4, 2009, 5:11 AM

Post #2 of 19
Re: Building a shared libperl should be the default for 5.12

On Wed, Nov 04, 2009 at 01:01:32PM +0000, Tim Bunce wrote:

> It seems to me that -Duseshrplib conflates two things that should be
> separate: a) the building and installing of a shared libperl, and
> b) linking the perl executable against it instead of the libperl.a.

Should we therefore compile all object code with -fPIC, with the same
object files linked statically to the perl executable, and used in the
shared library?

Or should we compile everything twice, once with -fPIC and once without,
using the -fPIC code for the shared library, and the normal code for the
executable?

The distinction between -fPIC and not matters on at least x86_64

(It used to matter on ARM, but they found a way to deal with relocations,
such that build systems that assumed that all world is a VAX 2.0 would
still work)

Nicholas Clark


h.m.brand at xs4all

Nov 4, 2009, 5:24 AM

Post #3 of 19
Re: Building a shared libperl should be the default for 5.12

On Wed, 4 Nov 2009 13:11:29 +0000, Nicholas Clark <nick[at]ccl4.org> wrote:

> On Wed, Nov 04, 2009 at 01:01:32PM +0000, Tim Bunce wrote:
>
> > It seems to me that -Duseshrplib conflates two things that should be
> > separate: a) the building and installing of a shared libperl, and
> > b) linking the perl executable against it instead of the libperl.a.
>
> Should we therefore compile all object code with -fPIC, with the same
> object files linked statically to the perl executable, and used in the
> shared library?
>
> Or should we compile everything twice, once with -fPIC and once without,
> using the -fPIC code for the shared library, and the normal code for the
> executable?
>
> The distinction between -fPIC and not matters on at least x86_64
>
> (It used to matter on ARM, but they found a way to deal with relocations,
> such that build systems that assumed that all world is a VAX 2.0 would
> still work)

/me is in favour of compiling everything with -fPIC (or +Z on HP-UX cc)
where possible

--
H.Merijn Brand http://tux.nl Perl Monger http://amsterdam.pm.org/
using & porting perl 5.6.2, 5.8.x, 5.10.x, 5.11.x on HP-UX 10.20, 11.00,
11.11, 11.23, and 11.31, OpenSuSE 10.3, 11.0, and 11.1, AIX 5.2 and 5.3.
http://mirrors.develooper.com/hpux/ http://www.test-smoke.org/
http://qa.perl.org http://www.goldmark.org/jeff/stupid-disclaimers/


Tim.Bunce at pobox

Nov 4, 2009, 7:21 AM

Post #4 of 19
Re: Building a shared libperl should be the default for 5.12

On Wed, Nov 04, 2009 at 01:11:29PM +0000, Nicholas Clark wrote:
> On Wed, Nov 04, 2009 at 01:01:32PM +0000, Tim Bunce wrote:
>
> > It seems to me that -Duseshrplib conflates two things that should be
> > separate: a) the building and installing of a shared libperl, and
> > b) linking the perl executable against it instead of the libperl.a.

After posting my earlier message I realized that it might not be so
straightforward. If an app wants to embed perl using the shared libperl
then it'll naturally want to use extensions.

That raises the binary compatibility issue: are extensions built for
a non-shared perl compatible with the shared libperl?

> Should we therefore compile all object code with -fPIC, with the same
> object files linked statically to the perl executable, and used in the
> shared library?
>
> Or should we compile everything twice, once with -fPIC and once without,
> using the -fPIC code for the shared library, and the normal code for the
> executable?
>
> The distinction between -fPIC and not matters on at least x86_64

I don't know what the issues/trade-offs are well enough to answer that.

Does using -fPIC but then static linking cause a performance loss on
major platforms (or can the use of PIC code be 'undone' during the
static linking, perhaps with an extra ld option)?

Since compiling everything twice would work, I think the binary
compatibility of extensions is a greater issue at this stage.
Any thoughts on that?

Tim.


doughera at lafayette

Nov 4, 2009, 8:46 AM

Post #5 of 19
Re: Building a shared libperl should be the default for 5.12

On Wed, 4 Nov 2009, Tim Bunce wrote:

> On Wed, Nov 04, 2009 at 01:11:29PM +0000, Nicholas Clark wrote:
> > On Wed, Nov 04, 2009 at 01:01:32PM +0000, Tim Bunce wrote:
> >
> > > It seems to me that -Duseshrplib conflates two things that should be
> > > separate: a) the building and installing of a shared libperl, and
> > > b) linking the perl executable against it instead of the libperl.a.
>
> After posting my earlier message I realized that it might not be so
> straightforward. If an app wants to embed perl using the shared libperl
> then it'll naturally want to use extensions.
>
> That raises the binary compatibility issue: are extensions built for
> a non-shared perl compatible with the shared libperl?

I can't think of any reason why they wouldn't be. The extensions are all
compiled with -fPIC (or equivalent) whether or not there's a shared
libperl. I can't think of any relevant flags that are changed in the
build process.

This does remind me, however, that the Config keys used for building with
an embedding perl *do* change depending on useshrplib. In particular,
linking the app would now require an appropriate -rpath flag, but perl
wouldn't have been built with that flag.

> Does using -fPIC but then static linking cause a performance loss on
> major platforms (or can the use of PIC code be 'undone' during the
> static linking, perhaps with an extra ld option)?

The benchmark data in INSTALL is mine; I'd estimate it's from early 1996.
Newer tests would certainly be welcome. Timing the test suite is no
longer suitable, since there are so many sleeps() and other things that
wait on I/O.

--
Andy Dougherty doughera[at]lafayette.edu


tony at develop-help

Nov 4, 2009, 4:01 PM

Post #6 of 19
Re: Building a shared libperl should be the default for 5.12

On Wed, Nov 04, 2009 at 01:11:29PM +0000, Nicholas Clark wrote:
> The distinction between -fPIC and not matters on at least x86_64

It makes a big difference on 32-bit x86 too, since it doesn't have any
EIP relative addressing modes.

Tony


nick at ccl4

Nov 9, 2009, 8:33 AM

Post #7 of 19
Re: Building a shared libperl should be the default for 5.12

On Thu, Nov 05, 2009 at 11:01:54AM +1100, Tony Cook wrote:
> On Wed, Nov 04, 2009 at 01:11:29PM +0000, Nicholas Clark wrote:
> > The distinction between -fPIC and not matters on at least x86_64
>
> It makes a big difference on 32-bit x86 too, since it doesn't have any
> EIP relative addressing modes.

Is that a performance difference, because -fPIC restricts the flexibility of
the code generator?

I only know the situation on x86_64 as an end user - presenting the dynamic
linker with code compiled without -fPIC causes it to bail.

On ARM, I know that the reason for failing to link wasn't (directly) the
quality of code generation, but that the default branch targets are only
24 bits, which isn't enough for the full 32-bit range of redirection
required for position-independent code. However, I believe that
this was "fixed" in the linker, rather than by changing code generation to
always use a slower and larger approach that could do 32-bit redirection.
(Possibly by generating stub thunks that did this, if they were needed.
But I'm only guessing here.)

Nicholas Clark


tony at develop-help

Nov 9, 2009, 1:57 PM

Post #8 of 19
Re: Building a shared libperl should be the default for 5.12

On Mon, Nov 09, 2009 at 04:33:06PM +0000, Nicholas Clark wrote:
> On Thu, Nov 05, 2009 at 11:01:54AM +1100, Tony Cook wrote:
> > On Wed, Nov 04, 2009 at 01:11:29PM +0000, Nicholas Clark wrote:
> > > The distinction between -fPIC and not matters on at least x86_64
> >
> > It makes a big difference on 32-bit x86 too, since it doesn't have any
> > EIP relative addressing modes.
>
> Is that a performance difference, because -fPIC restricts the flexibility of
> the code generator?

The size of the performance difference is obviously going to depend on
the code, but for a simple:


    /* probably low-performance code from a -fPIC point of view */
    int x;

    extern int y;

    int f(void) {
        return x+y;
    }

The non-PIC code is the trivial and obvious:

    f:
        pushl   %ebp
        movl    y, %eax
        movl    %esp, %ebp
        addl    x, %eax
        popl    %ebp
        ret

The PIC code is:

    f:
        call    __i686.get_pc_thunk.cx
        addl    $_GLOBAL_OFFSET_TABLE_, %ecx
        pushl   %ebp
        movl    %esp, %ebp
        popl    %ebp
        movl    y@GOT(%ecx), %eax
        movl    x@GOT(%ecx), %edx
        movl    (%eax), %eax
        addl    (%edx), %eax
        ret

(built with -O3, gcc 4.3.2 debian)

The linker doesn't complain if you build without -fPIC and build a
shared object.

Tony


Tim.Bunce at pobox

Nov 10, 2009, 4:21 PM

Post #9 of 19
Re: Building a shared libperl should be the default for 5.12

On Wed, Nov 04, 2009 at 11:46:56AM -0500, Andy Dougherty wrote:
>
> The benchmark data in INSTALL is mine; I'd estimate it's from early 1996.
> Newer tests would certainly be welcome. Timing the test suite is no
> longer suitable, since there are so many sleeps() and other things that
> wait on I/O.

Here are some initial results using perlbench on my 2GHz Intel Core Duo
MacBook Pro...

A) 5.11.1-longdouble-noshrplib-nothreads/bin/perl5.11.1
B) 5.11.1-longdouble-noshrplib-threads/bin/perl5.11.1
C) 5.11.1-longdouble-shrplib-threads/bin/perl5.11.1
D) 5.11.1-nolongdouble-noshrplib-nothreads/bin/perl5.11.1
E) 5.11.1-nolongdouble-noshrplib-threads/bin/perl5.11.1
F) 5.11.1-nolongdouble-shrplib-nothreads/bin/perl5.11.1

                         A    B    C    D    E    F
                       ---  ---  ---  ---  ---  ---
AVERAGE                100  106  107  100  107  101

I recall from previous use of perlbench that the noise level is approx +/-3.

The cost of threads is clear in A vs B (~6%) and D vs E (~7%).

The cost of shrplib is well within the noise in B vs C (~1%) and A vs F (~1%)

Good news! I should be able to rerun the test on a linux box soonish.

Tim.

                         A    B    C    D    E    F
                       ---  ---  ---  ---  ---  ---
arith/mixed            100  104  107  103  110  101
arith/trig             100  100  104  110  117  109
array/copy             100   99   99  101   99  100
array/foreach          100  113  114  100  117  102
array/index            100  114  111  103  112  103
array/pop              100  104  105   98  103  101
array/shift            100  103  104  100  101  100
array/sort-num         100   99   98  100   98   99
array/sort             100   99   99  101   98  100
call/0arg              100  119  119  104  115  105
call/1arg              100  115  109  101  112  101
call/2arg              100  112  113   96  114  101
call/9arg              100  108  110   99  109   99
call/empty             100  117  119   99  119   95
call/fib               100  116  116  105  113  106
call/method            100  112  113  103  114  102
call/wantarray         100  107  107  100  107   99
hash/copy              100  102  100   99  101   99
hash/each              100  106  106   98  105  101
hash/foreach-sort      100  102  102   99  104  100
hash/foreach           100  107  107   99  107   99
hash/get               100  111  111  101  109  101
hash/set               100  112  112   99  111  102
loop/for-c             100  117  117   92  110  100
loop/for-range-const   100  110  114   89  118  102
loop/for-range         100  106  112   91  118  101
loop/getline           100  101  102   98   97   98
loop/while-my          100  116  117   99  118   98
loop/while             100  130  130  114  130  117
re/const               100  102  101   98  101  101
re/w                   100  101  100  100  100  101
startup/fewmod         100   92   92  101   93  100
startup/lotsofsub      100   92   93  102   93  102
startup/noprog         100   97   90  101   95   93
string/base64          100   94   94  100   97   98
string/htmlparser      100   98   99  100   99  101
string/index-const     100  106  107   97  105   98
string/index-var       100  104  103   99  103   99
string/ipol            100  108  110  100  101  100
string/tr              100  102  101  101  101  102
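
The comparisons drawn from the AVERAGE row can be reproduced mechanically. The following is a minimal sketch that takes the six AVERAGE scores and the +/-3 noise estimate from the post as given; the function and dictionary names are invented for illustration. It also adds a D vs F comparison, on the assumption that D and F differ only in shrplib, whereas A and F also differ in longdouble.

```python
# AVERAGE row from the perlbench run (A is the 100 baseline).
AVERAGE = {"A": 100, "B": 106, "C": 107, "D": 100, "E": 107, "F": 101}
NOISE = 3  # approximate perlbench noise level quoted in the post


def percent_change(base, other, scores=AVERAGE):
    """Difference of `other` relative to `base`, in percent of `base`."""
    return 100.0 * (scores[other] - scores[base]) / scores[base]


def within_noise(base, other, scores=AVERAGE, noise=NOISE):
    """True when the two scores differ by no more than the noise level."""
    return abs(scores[other] - scores[base]) <= noise


COMPARISONS = {
    "threads, longdouble (A vs B)": ("A", "B"),
    "threads, nolongdouble (D vs E)": ("D", "E"),
    "shrplib, longdouble + threads (B vs C)": ("B", "C"),
    "shrplib, nothreads (A vs F)": ("A", "F"),
    "shrplib, nolongdouble + nothreads (D vs F)": ("D", "F"),
}

for label, (base, other) in COMPARISONS.items():
    print(f"{label}: {percent_change(base, other):+.1f}% "
          f"(within noise: {within_noise(base, other)})")
```

The sketch only measures how far apart two scores are; it makes no claim about which direction perlbench treats as faster.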


h.m.brand at xs4all

Nov 10, 2009, 11:33 PM

Post #10 of 19
Re: Building a shared libperl should be the default for 5.12

On Wed, 11 Nov 2009 00:21:04 +0000, Tim Bunce <Tim.Bunce[at]pobox.com>
wrote:

> On Wed, Nov 04, 2009 at 11:46:56AM -0500, Andy Dougherty wrote:
> >
> > The benchmark data in INSTALL is mine; I'd estimate it's from early 1996.
> > Newer tests would certainly be welcome. Timing the test suite is no
> > longer suitable, since there are so many sleeps() and other things that
> > wait on I/O.
>
> Here are some initial results using perlbench on my 2GHz Intel Core Duo
> MacBook Pro...
>
> A) 5.11.1-longdouble-noshrplib-nothreads/bin/perl5.11.1
> B) 5.11.1-longdouble-noshrplib-threads/bin/perl5.11.1
> C) 5.11.1-longdouble-shrplib-threads/bin/perl5.11.1
> D) 5.11.1-nolongdouble-noshrplib-nothreads/bin/perl5.11.1
> E) 5.11.1-nolongdouble-noshrplib-threads/bin/perl5.11.1
> F) 5.11.1-nolongdouble-shrplib-nothreads/bin/perl5.11.1
>
>                          A    B    C    D    E    F
>                        ---  ---  ---  ---  ---  ---
> AVERAGE                100  106  107  100  107  101
>
> I recall from previous use of perlbench that the noise level is approx +/-3.
>
> The cost of threads is clear in A vs B (~6%) and D vs E (~7%).
>
> The cost of shrplib is well within the noise in B vs C (~1%) and A vs F (~1%)
>
> Good news! I should be able to rerun the test on a linux box soonish.

I'd be very interested in seeing the difference on non-multi-CPU,
non-intel architectures, like PA-RISC, sparc, or powerpc

last time /I/ benched, the diff was about 32% on both PA-RISC and
PowerPC, but, as with Andy, that was years ago




Tim.Bunce at pobox

Nov 11, 2009, 1:25 AM

Post #11 of 19
Re: Building a shared libperl should be the default for 5.12

On Wed, Nov 11, 2009 at 08:33:14AM +0100, H.Merijn Brand wrote:
> On Wed, 11 Nov 2009 00:21:04 +0000, Tim Bunce <Tim.Bunce[at]pobox.com>
> wrote:
>
> > A) 5.11.1-longdouble-noshrplib-nothreads/bin/perl5.11.1
> > B) 5.11.1-longdouble-noshrplib-threads/bin/perl5.11.1
> > C) 5.11.1-longdouble-shrplib-threads/bin/perl5.11.1
> > D) 5.11.1-nolongdouble-noshrplib-nothreads/bin/perl5.11.1
> > E) 5.11.1-nolongdouble-noshrplib-threads/bin/perl5.11.1
> > F) 5.11.1-nolongdouble-shrplib-nothreads/bin/perl5.11.1
> >
> >                          A    B    C    D    E    F
> >                        ---  ---  ---  ---  ---  ---
> > AVERAGE                100  106  107  100  107  101
> >
> > I recall from previous use of perlbench that the noise level is approx +/-3.
> > The cost of threads is clear in A vs B (~6%) and D vs E (~7%).
> > The cost of shrplib is well within the noise in B vs C (~1%) and A vs F (~1%)
> > Good news! I should be able to rerun the test on a linux box soonish.
>
> I'd be very interested in seeing the difference on non-multi-CPU,
> non-intel architectures, like PA-RISC, sparc, or powerpc
>
> last time /I/ benched, the diff was about 32% on both PA-RISC and
> PowerPC, but, as with Andy, that was years ago

I don't have access to those, but you could use the script I posted
recently on the "Test::Smoke and perlivp" thread to build the variants
yourself.

Tim.


h.m.brand at xs4all

Nov 11, 2009, 2:02 AM

Post #12 of 19
Re: Building a shared libperl should be the default for 5.12

On Wed, 11 Nov 2009 09:25:59 +0000, Tim Bunce <Tim.Bunce[at]pobox.com>
wrote:

> On Wed, Nov 11, 2009 at 08:33:14AM +0100, H.Merijn Brand wrote:
> > On Wed, 11 Nov 2009 00:21:04 +0000, Tim Bunce <Tim.Bunce[at]pobox.com>
> > wrote:
> >
> > > A) 5.11.1-longdouble-noshrplib-nothreads/bin/perl5.11.1
> > > B) 5.11.1-longdouble-noshrplib-threads/bin/perl5.11.1
> > > C) 5.11.1-longdouble-shrplib-threads/bin/perl5.11.1
> > > D) 5.11.1-nolongdouble-noshrplib-nothreads/bin/perl5.11.1
> > > E) 5.11.1-nolongdouble-noshrplib-threads/bin/perl5.11.1
> > > F) 5.11.1-nolongdouble-shrplib-nothreads/bin/perl5.11.1
> > >
> > >                          A    B    C    D    E    F
> > >                        ---  ---  ---  ---  ---  ---
> > > AVERAGE                100  106  107  100  107  101
> > >
> > > I recall from previous use of perlbench that the noise level is approx +/-3.
> > > The cost of threads is clear in A vs B (~6%) and D vs E (~7%).
> > > The cost of shrplib is well within the noise in B vs C (~1%) and A vs F (~1%)
> > > Good news! I should be able to rerun the test on a linux box soonish.
> >
> > I'd be very interested in seeing the difference on non-multi-CPU,
> > non-intel architectures, like PA-RISC, sparc, or powerpc
> >
> > last time /I/ benched, the diff was about 32% on both PA-RISC and
> > PowerPC, but, as with Andy, that was years ago
>
> I don't have access to those, but you could use the script I posted
> recently on the "Test::Smoke and perlivp" thread to build the variants
> yourself.

Seen them, read them, and made Abeltje aware of them (the Test::Smoke
maintainer).

I have no resources to be able to actually install the builds I smoke.
I've proposed a patch for Test::Smoke, so I can revive my speed diff
reports again, which are purely based on the tail lines of tests:

Files=1774, Tests=262615, 196 wallclock secs (36.78 usr 9.82 sys + 344.28 cusr 42.38 csys = 433.26 CPU)

which would result in a relative CPU speed of 262615/433.26, or roughly
606 tests per CPU second.
That is still based on the complete test suite, and thus not really
fair, as thread tests are skipped (and thus also not counted) for
unthreaded perl builds.

But it will show a good indication of performance diffs between the
different configurations.

In the early days I included those speed diffs in my smoke mails, but
that got a bit out of hand (higher is better):

Automated smoke report for patch 17906 cc gcc
                                                      | HP-UX 11.00 B.11.11.04 3.2 32-bit
O = OK                                                | 3.2 64-bit +GNUld
F = Failure(s), extended report at the bottom         | HP-UX 10.20 A.10.32.30 3.2
? = still running or test results not (yet) available | AIX 4.3.3.0 vac 5.0.2.5 3.1.1
Build failures during: - = unknown, = skipped         | AIX 4.2.1.0 xlc 3.1.4.10 3.1.1
c = Configure, m = make, t = make test-prep           | Cygwin 1.3.12 3.2-1

HP-UX             HP-UX             HP-UX             HP-UX             AIX               AIX               AIX
11.00             11.00             10.20             10.20             4.3.3             4.3.3             4.2.1
HPc               gcc               HPc               gcc               vac               gcc               xlc
17906             17906             17906             17906             17906             17906             17906            Configuration
---------------   ---------------   ---------------   ---------------   ---------------   ---------------   ---------------  ------------------------------------------------------------
100  99  97  95 |  75  75  75  75 |  63  61  56  55 |  52  52  53  53 |  52  52  45  45 |  45  46  42  42 |  37  36  31  31 |
 89  89  79  79 |  79  79  79  79 |                 |                 |  51  51  44  44 |  47  46  42  41 |                 |-Duse64bitint
 90  89  80  79 |                 |                 |                 |                 |                 |                 |-Duse64bitall
                |                 |                 |                 |  51  51  44  44 |  47  46  42  42 |  37  37  31  31 |-Duselongdouble
                |                 |                 |                 |  49  49  44  43 |                 |                 |-Dusemorebits
                |                 |                 |                 |                 |                 |                 |-Duse64bitall -Duselongdouble
 68  67  57  60 |  54  55  54  53 |  34  34  29  30 |  28  28  27  28 |  41  41  36  36 |  39  39         |  24  24  20  21 |-Dusethreads -Duseithreads
 61  58  53  54 |  52  51  52  53 |                 |                 |  39  40  36  36 |                 |                 |-Duse64bitint -Dusethreads -Duseithreads
 61  61  53  53 |                 |                 |                 |                 |                 |                 |-Duse64bitall -Dusethreads -Duseithreads
                |                 |                 |                 |  40  40  36  36 |                 |  24  24  20  20 |-Duselongdouble -Dusethreads -Duseithreads
                |                 |                 |                 |  39  39  35  35 |                 |                 |-Dusemorebits -Dusethreads -Duseithreads
                |                 |                 |                 |                 |                 |                 |-Duse64bitall -Duselongdouble -Dusethreads -Duseithreads
  |   |   |   |
  |   |   |   +- PERLIO = perlio -DDEBUGGING
  |   |   +----- PERLIO = stdio -DDEBUGGING
  |   +--------- PERLIO = perlio
  +------------- PERLIO = stdio




craig.a.berry at gmail

Nov 11, 2009, 8:19 AM

Post #13 of 19
Re: Building a shared libperl should be the default for 5.12

On Tue, Nov 10, 2009 at 6:21 PM, Tim Bunce <Tim.Bunce[at]pobox.com> wrote:

> The cost of shrplib is well within the noise in B vs C (~1%) and A vs F (~1%)

I'm puzzled by the absence in this discussion of any mention of what
gets done with the shared library after it's built. On systems I'm
more familiar with, that's where anyone concerned with the performance
of shared libraries focuses their attention. On Windows you register
DLLs. On VMS, you install images as known and/or shared. These give
you various degrees and types of pre-loading, and can reduce
filesystem look-up time and shoveling code from disk into memory time.
On VMS with the shared option, you reduce overall memory consumption
and increase cache hit rates by having every process that references
code in the library go after the same pages in memory.

I know I'm ignorant, but what am I missing? Does all of this happen
automagically on Unix?


h.m.brand at xs4all

Nov 11, 2009, 9:16 AM

Post #14 of 19
Re: Building a shared libperl should be the default for 5.12

On Wed, 11 Nov 2009 00:21:04 +0000, Tim Bunce <Tim.Bunce[at]pobox.com>
wrote:

> On Wed, Nov 04, 2009 at 11:46:56AM -0500, Andy Dougherty wrote:
> >
> > The benchmark data in INSTALL is mine; I'd estimate it's from early 1996.
> > Newer tests would certainly be welcome. Timing the test suite is no
> > longer suitable, since there are so many sleeps() and other things that
> > wait on I/O.
>
> Here are some initial results using perlbench on my 2GHz Intel Core Duo
> MacBook Pro...
>
> A) 5.11.1-longdouble-noshrplib-nothreads/bin/perl5.11.1
> B) 5.11.1-longdouble-noshrplib-threads/bin/perl5.11.1
> C) 5.11.1-longdouble-shrplib-threads/bin/perl5.11.1
> D) 5.11.1-nolongdouble-noshrplib-nothreads/bin/perl5.11.1
> E) 5.11.1-nolongdouble-noshrplib-threads/bin/perl5.11.1
> F) 5.11.1-nolongdouble-shrplib-nothreads/bin/perl5.11.1

The numbers are not (yet) available for all my smokes, but they will
come in as new smokes finish. On my quad-core Q9450 I see an average
performance drop of 10% when building with -Duseithreads (both with and
without -Duselongdouble). See http://doc.procura.nl/smoke/index.html
(wide screens are advised for easy viewing).

I did not try with shrplib/noshrplib.

The average performance drop for -DDEBUGGING is worse: 16%

>                          A    B    C    D    E    F
>                        ---  ---  ---  ---  ---  ---
> AVERAGE                100  106  107  100  107  101
>
> I recall from previous use of perlbench that the noise level is approx +/-3.
>
> The cost of threads is clear in A vs B (~6%) and D vs E (~7%).
>
> The cost of shrplib is well within the noise in B vs C (~1%) and A vs F (~1%)
>
> Good news! I should be able to rerun the test on a linux box soonish.



h.m.brand at xs4all

Nov 12, 2009, 4:32 AM

Post #15 of 19
Re: Building a shared libperl should be the default for 5.12

On Wed, 11 Nov 2009 09:25:59 +0000, Tim Bunce <Tim.Bunce[at]pobox.com>
wrote:

> On Wed, Nov 11, 2009 at 08:33:14AM +0100, H.Merijn Brand wrote:
> > On Wed, 11 Nov 2009 00:21:04 +0000, Tim Bunce <Tim.Bunce[at]pobox.com>
> > wrote:
> >
> > > A) 5.11.1-longdouble-noshrplib-nothreads/bin/perl5.11.1
> > > B) 5.11.1-longdouble-noshrplib-threads/bin/perl5.11.1
> > > C) 5.11.1-longdouble-shrplib-threads/bin/perl5.11.1
> > > D) 5.11.1-nolongdouble-noshrplib-nothreads/bin/perl5.11.1
> > > E) 5.11.1-nolongdouble-noshrplib-threads/bin/perl5.11.1
> > > F) 5.11.1-nolongdouble-shrplib-nothreads/bin/perl5.11.1

Doing a *fair* average over my available numbers, the summary is

threaded is 3.1% slower than non-threaded
DEBUGGING is 7.9% slower than non-DEBUGGING
gcc/g++ is 42.0% slower than cc
stdio is 4.4% slower than perlio

Where FAIR means that I only use statistics if the number of
measurements is the same for A and B on the same architecture,
with all other parameters equal. Those numbers might
change as more measurements become available. For the above data, the
number of measurements taken to calculate the average was between 32
and 104, which sounds representative enough.
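
One possible reading of that "fair" rule, sketched in code: measurements are grouped by architecture and by all options other than the one under study, and a group only counts when both sides have the same number of samples. The data layout, the function name, the sample values, and the choice to average the per-group slowdowns equally are all assumptions made for illustration; the post does not say how the real averages were weighted.

```python
from collections import defaultdict
from statistics import mean


def fair_slowdown(measurements, option):
    """Average slowdown (percent) of builds with `option` versus without.

    `measurements` is an iterable of (arch, options, speed) tuples, where
    `options` is a set of configuration flags and a higher `speed` is better.
    Groups with unequal sample counts on the two sides are ignored.
    """
    groups = defaultdict(lambda: {True: [], False: []})
    for arch, options, speed in measurements:
        others = frozenset(options) - {option}
        groups[(arch, others)][option in options].append(speed)

    slowdowns = []
    for sides in groups.values():
        with_opt, without_opt = sides[True], sides[False]
        if not with_opt or len(with_opt) != len(without_opt):
            continue  # unbalanced group: excluded by the "fair" rule
        slowdowns.append(100.0 * (1 - mean(with_opt) / mean(without_opt)))
    return mean(slowdowns) if slowdowns else None


# Hypothetical sample data, not taken from any real smoke report.
SAMPLES = [
    ("hpux", frozenset(), 100), ("hpux", frozenset({"threads"}), 90),
    ("aix", frozenset(), 50), ("aix", frozenset({"threads"}), 45),
    ("linux", frozenset(), 80),  # no threaded counterpart, so ignored
]
print(fair_slowdown(SAMPLES, "threads"))
```

With this made-up data both balanced groups show a 10% slowdown, and the unmatched linux sample does not affect the result.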

> > >                          A    B    C    D    E    F
> > >                        ---  ---  ---  ---  ---  ---
> > > AVERAGE                100  106  107  100  107  101
> > >
> > > I recall from previous use of perlbench that the noise level is approx +/-3.
> > > The cost of threads is clear in A vs B (~6%) and D vs E (~7%).
> > > The cost of shrplib is well within the noise in B vs C (~1%) and A vs F (~1%)
> > > Good news! I should be able to rerun the test on a linux box soonish.
> >
> > I'd be very interested in seeing the difference on non-multi-CPU,
> > non-intel architectures, like PA-RISC, sparc, or powerpc
> >
> > last time /I/ benched, the diff was about 32% on both PA-RISC and
> > PowerPC, but, as with Andy, that was years ago
>
> I don't have access to those, but you could use the script I posted
> recently on the "Test::Smoke and perlivp" thread to build the variants
> yourself.



doughera at lafayette

Nov 12, 2009, 8:18 AM

Post #16 of 19
Re: Building a shared libperl should be the default for 5.12

On Wed, 11 Nov 2009, Tim Bunce wrote:

> On Wed, Nov 04, 2009 at 11:46:56AM -0500, Andy Dougherty wrote:
> >
> > The benchmark data in INSTALL is mine; I'd estimate it's from early 1996.
> > Newer tests would certainly be welcome. Timing the test suite is no
> > longer suitable, since there are so many sleeps() and other things that
> > wait on I/O.
>
> Here are some initial results using perlbench on my 2GHz Intel Core Duo
> MacBook Pro...
>
> A) 5.11.1-longdouble-noshrplib-nothreads/bin/perl5.11.1
> B) 5.11.1-longdouble-noshrplib-threads/bin/perl5.11.1
> C) 5.11.1-longdouble-shrplib-threads/bin/perl5.11.1
> D) 5.11.1-nolongdouble-noshrplib-nothreads/bin/perl5.11.1
> E) 5.11.1-nolongdouble-noshrplib-threads/bin/perl5.11.1
> F) 5.11.1-nolongdouble-shrplib-nothreads/bin/perl5.11.1
>
>                          A    B    C    D    E    F
>                        ---  ---  ---  ---  ---  ---
> AVERAGE                100  106  107  100  107  101
>
> I recall from previous use of perlbench that the noise level is approx +/-3.
>
> The cost of threads is clear in A vs B (~6%) and D vs E (~7%).
>
> The cost of shrplib is well within the noise in B vs C (~1%) and A vs F (~1%)

I've always been skeptical about perlbench for these measurements --
doesn't it try hard to normalize out some of the basic looping overhead?
Does that normalization remove any "interesting" information? (It's been a
very long time since I looked at perlbench innards, so my recollections
could be very out of date.)

What do you get if you run each of those perl versions on a "typical"
program?

For example, I tried running spamassassin on a collection of recent mail,
and it took 137s with -Uuseshrplib, and 167s with -Duseshrplib, or a 22%
penalty! This was with two completely separate standalone spamassassin
installations, not running the spamd daemon or anything. (Threads came
somewhere in between, at 152 seconds.)

The system was a plain Debian/Linux/x86. It is rather heavily loaded (but
then it's always heavily loaded). Each time was averaged over 5 trials,
and the standard deviations were under one second. This mimics my typical
usage of perl.
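
A measurement of this kind (several trials, mean and standard deviation, then a relative penalty) can be scripted. What follows is a minimal sketch rather than the procedure actually used; the helper names are invented, and the perl paths in the closing comment are hypothetical placeholders.

```python
import statistics
import subprocess
import time


def time_trials(run_once, trials=5):
    """Call run_once() `trials` times; return (mean, stdev) in seconds."""
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        run_once()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)


def command_runner(argv):
    """Wrap a command line as a callable suitable for time_trials()."""
    return lambda: subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)


def penalty_percent(baseline_seconds, candidate_seconds):
    """Extra time of the candidate relative to the baseline, in percent."""
    return 100.0 * (candidate_seconds - baseline_seconds) / baseline_seconds


# The timings reported in the post: static 137s, threads 152s, shared 167s.
print(round(penalty_percent(137, 167)))  # shared libperl vs static
print(round(penalty_percent(137, 152)))  # threads vs static

# Hypothetical usage with two separately installed perls:
#   static_mean, _ = time_trials(command_runner(["/opt/perl-static/bin/perl", "job.pl"]))
#   shared_mean, _ = time_trials(command_runner(["/opt/perl-shared/bin/perl", "job.pl"]))
#   print(penalty_percent(static_mean, shared_mean))
```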

--
Andy Dougherty doughera[at]lafayette.edu


nick at ccl4

Nov 16, 2009, 5:09 AM

Post #17 of 19
Re: Building a shared libperl should be the default for 5.12

On Thu, Nov 12, 2009 at 11:18:54AM -0500, Andy Dougherty wrote:

> I've always been skeptical about perlbench for these measurements --
> doesn't it try hard to normalize out some of the basic looping overhead?
> Does that normalization remove any "interesting" information? (It's been a
> very long time since I looked at perlbench innards, so my recollections
> could be very out of date.)
>
> What do you get if you run each of those perl versions on a "typical"
> program?
>
> For example, I tried running spamassassin on a collection of recent mail,
> and it took 137s with -Uuseshrplib, and 167s with -Duseshrplib, or a 22%
> penalty! This was with two completely separate standalone spamassassin
> installations, not running the spamd daemon or anything. (Threads came
> somewhere in between, at 152 seconds.)
>
> The system was a plain Debian/Linux/x86. It is rather heavily loaded (but
> then it's always heavily loaded). Each time was averaged over 5 trials,
> and the standard deviations were under one second. This mimics my typical
> usage of perl.

Not as statistically "robust" as yours, but on my mostly unloaded x86 desktop
at work:

Running Mail::SpamAssassin's regression tests on a freshly built blead, compiled with -Os:

Without -Duseshrplib

Files=143, Tests=2027, 413 wallclock secs ( 0.75 usr 0.11 sys + 176.51 cusr 7.63 csys = 185.00 CPU)
Files=143, Tests=2027, 402 wallclock secs ( 0.71 usr 0.12 sys + 176.56 cusr 8.21 csys = 185.60 CPU)
Files=143, Tests=2027, 408 wallclock secs ( 0.76 usr 0.12 sys + 176.69 cusr 7.62 csys = 185.19 CPU)

With -Duseshrplib

Files=143, Tests=2027, 491 wallclock secs ( 0.82 usr 0.10 sys + 213.58 cusr 7.92 csys = 222.42 CPU)
Files=143, Tests=2027, 490 wallclock secs ( 0.88 usr 0.07 sys + 213.59 cusr 7.70 csys = 222.24 CPU)
Files=143, Tests=2027, 488 wallclock secs ( 0.84 usr 0.13 sys + 213.04 cusr 7.70 csys = 221.71 CPU)


So about 17% more CPU time needed.
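
The percentage depends on which build is taken as the baseline. A rough sketch in Python, using the totals from the six runs above (the names here are only illustrative), works it out both ways:

```python
from statistics import mean

# Total CPU seconds and wallclock seconds from the runs listed above.
static_cpu = [185.00, 185.60, 185.19]   # without -Duseshrplib
shared_cpu = [222.42, 222.24, 221.71]   # with -Duseshrplib
static_wall = [413, 402, 408]
shared_wall = [491, 490, 488]

def compare(shared, static):
    """Return (increase, saving) in percent between the two means.

    increase: extra cost of the shared build relative to the static one.
    saving:   the same gap expressed relative to the shared build.
    """
    s, t = mean(shared), mean(static)
    return (s - t) / t * 100.0, (s - t) / s * 100.0

cpu_increase, cpu_saving = compare(shared_cpu, static_cpu)
wall_increase, wall_saving = compare(shared_wall, static_wall)

print(f"CPU:  +{cpu_increase:.1f}% vs static, -{cpu_saving:.1f}% vs shared")
print(f"wall: +{wall_increase:.1f}% vs static, -{wall_saving:.1f}% vs shared")
```

Measured against the non-shared build the gap comes out near 20%; the figure of about 17% corresponds to measuring the same gap against the shared build.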

That's not a good thing to turn on by default.

We'd really need a way to build a shared perl library with shared flags, and
a regular perl without.

Nicholas Clark


h.m.brand at xs4all

Nov 16, 2009, 5:23 AM

Post #18 of 19
Re: Building a shared libperl should be the default for 5.12

On Mon, 16 Nov 2009 13:09:50 +0000, Nicholas Clark <nick[at]ccl4.org>
wrote:

> So about 17% more CPU time needed.
>
> That's not a good thing to turn on by default.
>
> We'd really need a way to build a shared perl library with shared flags, and
> a regular perl without.

I'm not so sure about that. If we do, I want to make absolutely sure
that modules that depend on shared things (Tk, DBI), and modules that
depend on those and/or on shared libraries (DBD::Oracle), still function on
a perl that was built without shared flags (on Linux, HP-UX (PA and
IPF), AIX, VMS, Windows, and SPARC).

--
H.Merijn Brand http://tux.nl Perl Monger http://amsterdam.pm.org/
using & porting perl 5.6.2, 5.8.x, 5.10.x, 5.11.x on HP-UX 10.20, 11.00,
11.11, 11.23, and 11.31, OpenSuSE 10.3, 11.0, and 11.1, AIX 5.2 and 5.3.
http://mirrors.develooper.com/hpux/ http://www.test-smoke.org/
http://qa.perl.org http://www.goldmark.org/jeff/stupid-disclaimers/


doughera at lafayette

Nov 16, 2009, 5:46 AM

Post #19 of 19
Re: Building a shared libperl should be the default for 5.12

On Mon, 16 Nov 2009, Nicholas Clark wrote:

> So about 17% more CPU time needed.
>
> That's not a good thing to turn on by default.
>
> We'd really need a way to build a shared perl library with shared flags, and
> a regular perl without.

Right. And I think that's what Tim was ultimately advocating.
(Just to be sure, I have re-run some tests to verify that it's the -fpic,
not the shared lib loading, that slows things down.)

This is, then, a build & Configure problem, along with some documentation
fixes. Someone needs to think very carefully through all the linking
flags and options for how to piece this all together so that it just
works, and is documented appropriately. Alas, if there's any real chance
of doing this before 5.12, that someone will have to be someone other than
me.

--
Andy Dougherty doughera[at]lafayette.edu
