Mailing List Archive: Python: Dev

shared data (was: Some thoughts on the codecs...)

 

 



gstein at lyra

Nov 16, 1999, 5:09 AM

Post #1 of 11 (423 views)
shared data (was: Some thoughts on the codecs...)

On Mon, 15 Nov 1999, Guido van Rossum wrote:
>...
> > The problem with these large tables is that currently
> > Python modules are not shared among processes since
> > every process builds its own table.
> >
> > Static C data has the advantage of being shareable at
> > the OS level.
>
> Don't worry about it. 128K is too small to care, I think...

This is the reason Python starts up so slow and has a large memory
footprint. There hasn't been any concern for moving stuff into shared data
pages. As a result, a process must map in a bunch of vmem pages, for no
other reason than to allocate Python structures in that memory and copy
constants in.

Go start Perl 100 times, then do the same with Python. Python is
significantly slower. I've actually written a web app in PHP because
another one that I did in Python had slow response time.
[. yah: the Real Man Answer is to write a real/good mod_python. ]

Cheers,
-g

--
Greg Stein, http://www.lyra.org/


gward at cnri

Nov 16, 1999, 7:10 AM

Post #2 of 11 (418 views)
Re: shared data (was: Some thoughts on the codecs...) [In reply to]

On 16 November 1999, Greg Stein said:
> This is the reason Python starts up so slow and has a large memory
> footprint. There hasn't been any concern for moving stuff into shared data
> pages. As a result, a process must map in a bunch of vmem pages, for no
> other reason than to allocate Python structures in that memory and copy
> constants in.
>
> Go start Perl 100 times, then do the same with Python. Python is
> significantly slower. I've actually written a web app in PHP because
> another one that I did in Python had slow response time.
> [. yah: the Real Man Answer is to write a real/good mod_python. ]

I don't think this is the only factor in startup overhead. Try looking
into the number of system calls for the trivial startup case of each
interpreter:

$ truss perl -e 1 2> perl.log
$ truss python -c 1 2> python.log

(This is on Solaris; I did the same thing on Linux with "strace", and on
IRIX with "par -s -SS". Dunno about other Unices.) The results are
interesting, and useful despite the platform and version disparities.

(For the record: Python 1.5.2 on all three platforms; Perl 5.005_03 on
Solaris, 5.004_05 on Linux, and 5.004_04 on IRIX. The Solaris is 2.6,
using the Official CNRI Python Build by Barry, and the ditto Perl build
by me; the Linux system is starship, using whatever Perl and Python the
Starship Masters provide us with; the IRIX box is an elderly but
well-maintained SGI Challenge running IRIX 5.3.)

Also, this is with an empty PYTHONPATH. The Solaris build of Python has
different prefix and exec_prefix, but on the Linux and IRIX builds, they
are the same. (I think this will reflect poorly on the Solaris
version.) PERLLIB, PERL5LIB, and Perl's builtin @INC should not affect
startup of the trivial "1" script, so I haven't paid attention to them.

First, the size of log files (in lines), i.e. number of system calls:

           Solaris   Linux   IRIX[1]
Perl            88      85        70
Python         425     316       257

[1] after chopping off the summary counts from the "par" output -- ie.
these really are the number of system calls, not the number of
lines in the log files
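
As a rough sketch of how such per-call counts can be pulled out of a trace log with a few lines of Python (the "Err#" failure marker assumed here is how Solaris truss reports errors; strace and par format their output differently, so treat this as illustrative only):

import re
import sys

def count_calls(logfile, syscall):
    # Count lines that invoke `syscall`, plus how many of those
    # failed (Solaris truss flags failures with "Err#").
    total = failed = 0
    for line in open(logfile):
        if re.match(r'\s*%s\(' % syscall, line):
            total = total + 1
            if 'Err#' in line:
                failed = failed + 1
    return total, failed

if __name__ == '__main__':
    # e.g.:  python countcalls.py python.log open
    print(count_calls(sys.argv[1], sys.argv[2]))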

Next, the number of "open" calls:

           Solaris   Linux   IRIX
Perl            16      10      9
Python         107      71     48

(It looks as though *all* of the Perl 'open' calls are due to the
dynamic linker going through /usr/lib and/or /lib.)

And the number of unsuccessful "open" calls:

           Solaris   Linux   IRIX
Perl             6       1      3
Python          77      49     32

Number of "mmap" calls:

           Solaris   Linux   IRIX
Perl            25      25      1
Python          36      24      1

...nope, guess we can't blame mmap for any Perl/Python startup
disparity.

How about "brk":

           Solaris   Linux   IRIX
Perl             6      11     12
Python          47      39     25

...ok, looks like Greg's gripe about memory holds some water.

Rerunning "truss" on Solaris with "python -S -c 1" drastically reduces
the startup overhead as measured by "number of system calls". Some
quick timing experiments show a drastic speedup (in wall-clock time) by
adding "-S": about 37% faster under Solaris, 56% faster under Linux, and
35% under IRIX. These figures should be taken with a large grain of
salt, as the Linux and IRIX systems were fairly well loaded at the time,
and the wall-clock results I measured had huge variance. Still, it gets
the point across.

Oh, also for the record, all timings were done like:

perl -e 'for $i (1 .. 100) { system "python", "-S", "-c", "1"; }'

because I wanted to guarantee no shell was involved in the Python
startup.
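
For completeness, the same loop can be driven from Python itself without involving a shell, by using fork/exec directly (a Unix-only sketch, purely illustrative):

import os
import time

def time_startups(n=100):
    # Spawn "python -S -c 1" n times via fork/exec (no shell) and
    # return the average wall-clock seconds per startup.
    start = time.time()
    for i in range(n):
        pid = os.fork()
        if pid == 0:
            try:
                os.execvp("python", ["python", "-S", "-c", "1"])
            finally:
                os._exit(127)   # only reached if the exec itself fails
        os.waitpid(pid, 0)
    return (time.time() - start) / n

print(time_startups())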

Greg
--
Greg Ward - software developer gward [at] cnri
Corporation for National Research Initiatives
1895 Preston White Drive voice: +1-703-620-8990
Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913


akuchlin at mems-exchange

Nov 16, 1999, 7:35 AM

Post #3 of 11 (419 views)
Re: shared data (was: Some thoughts on the codecs...) [In reply to]

Greg Ward writes:
>Next, the number of "open" calls:
>            Solaris   Linux   IRIX
> Perl            16      10      9
> Python         107      71     48

Running 'python -v' explains this:

amarok akuchlin>python -v
# /usr/local/lib/python1.5/exceptions.pyc matches /usr/local/lib/python1.5/exceptions.py
import exceptions # precompiled from /usr/local/lib/python1.5/exceptions.pyc
# /usr/local/lib/python1.5/site.pyc matches /usr/local/lib/python1.5/site.py
import site # precompiled from /usr/local/lib/python1.5/site.pyc
# /usr/local/lib/python1.5/os.pyc matches /usr/local/lib/python1.5/os.py
import os # precompiled from /usr/local/lib/python1.5/os.pyc
import posix # builtin
# /usr/local/lib/python1.5/posixpath.pyc matches /usr/local/lib/python1.5/posixpath.py
import posixpath # precompiled from /usr/local/lib/python1.5/posixpath.pyc
# /usr/local/lib/python1.5/stat.pyc matches /usr/local/lib/python1.5/stat.py
import stat # precompiled from /usr/local/lib/python1.5/stat.pyc
# /usr/local/lib/python1.5/UserDict.pyc matches /usr/local/lib/python1.5/UserDict.py
import UserDict # precompiled from /usr/local/lib/python1.5/UserDict.pyc
Python 1.5.2 (#80, May 25 1999, 18:06:07) [GCC 2.8.1] on sunos5
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
import readline # dynamically loaded from /usr/local/lib/python1.5/lib-dynload/readline.so

And each import tries several different forms of the module name:

stat("/usr/local/lib/python1.5/os", 0xEFFFD5E0) Err#2 ENOENT
open("/usr/local/lib/python1.5/os.so", O_RDONLY) Err#2 ENOENT
open("/usr/local/lib/python1.5/osmodule.so", O_RDONLY) Err#2 ENOENT
open("/usr/local/lib/python1.5/os.py", O_RDONLY) = 4

I don't see how this is fixable, unless we strip down site.py, which
drags in os, which drags in os.path and stat and UserDict.

--
A.M. Kuchling http://starship.python.net/crew/amk/
I'm going stir-crazy, and I've joined the ranks of the walking brain-dead, but
otherwise I'm just peachy.
-- Lyta Hall on parenthood, in SANDMAN #40: "Parliament of Rooks"


mal at lemburg

Nov 16, 1999, 7:36 AM

Post #4 of 11 (418 views)
Re: shared data (was: Some thoughts on the codecs...) [In reply to]

Greg Ward wrote:
>
> > Go start Perl 100 times, then do the same with Python. Python is
> > significantly slower. I've actually written a web app in PHP because
> > another one that I did in Python had slow response time.
> > [. yah: the Real Man Answer is to write a real/good mod_python. ]
>
> I don't think this is the only factor in startup overhead. Try looking
> into the number of system calls for the trivial startup case of each
> interpreter:
>
> $ truss perl -e 1 2> perl.log
> $ truss python -c 1 2> python.log
>
> (This is on Solaris; I did the same thing on Linux with "strace", and on
> IRIX with "par -s -SS". Dunno about other Unices.) The results are
> interesting, and useful despite the platform and version disparities.
>
> (For the record: Python 1.5.2 on all three platforms; Perl 5.005_03 on
> Solaris, 5.004_05 on Linux, and 5.004_04 on IRIX. The Solaris is 2.6,
> using the Official CNRI Python Build by Barry, and the ditto Perl build
> by me; the Linux system is starship, using whatever Perl and Python the
> Starship Masters provide us with; the IRIX box is an elderly but
> well-maintained SGI Challenge running IRIX 5.3.)
>
> Also, this is with an empty PYTHONPATH. The Solaris build of Python has
> different prefix and exec_prefix, but on the Linux and IRIX builds, they
> are the same. (I think this will reflect poorly on the Solaris
> version.) PERLLIB, PERL5LIB, and Perl's builtin @INC should not affect
> startup of the trivial "1" script, so I haven't paid attention to them.

For kicks I've done a similar test with cgipython, the
one file version of Python 1.5.2:

> First, the size of log files (in lines), i.e. number of system calls:
>
>            Solaris   Linux   IRIX[1]
> Perl            88      85        70
> Python         425     316       257

cgipython      182

> [1] after chopping off the summary counts from the "par" output -- ie.
> these really are the number of system calls, not the number of
> lines in the log files
>
> Next, the number of "open" calls:
>
>            Solaris   Linux   IRIX
> Perl            16      10      9
> Python         107      71     48

cgipython       33

> (It looks as though *all* of the Perl 'open' calls are due to the
> dynamic linker going through /usr/lib and/or /lib.)
>
> And the number of unsuccessful "open" calls:
>
>            Solaris   Linux   IRIX
> Perl             6       1      3
> Python          77      49     32

cgipython       28

Note that cgipython does search for sitecustomize.py.

>
> Number of "mmap" calls:
>
>            Solaris   Linux   IRIX
> Perl            25      25      1
> Python          36      24      1

cgipython       13

>
> ...nope, guess we can't blame mmap for any Perl/Python startup
> disparity.
>
> How about "brk":
>
>            Solaris   Linux   IRIX
> Perl             6      11     12
> Python          47      39     25

cgipython       41 (?)

So at least in theory, using cgipython for the intended
purpose should gain some performance.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 45 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/


bwarsaw at cnri

Nov 16, 1999, 9:14 AM

Post #5 of 11 (417 views)
Re: shared data (was: Some thoughts on the codecs...) [In reply to]

>>>>> "AMK" == Andrew M Kuchling <akuchlin [at] mems-exchange> writes:

AMK> I don't see how this is fixable, unless we strip down
AMK> site.py, which drags in os, which drags in os.path and stat
AMK> and UserDict.

One approach might be to support loading modules out of jar files (or
whatever) using Greg's imputils. We could put the bootstrap .pyc files
in this jar and teach Python to import from it first. Python
installations could even craft their own modules.jar file to include
whatever modules they are willing to "hard code". This, with -S, might
make Python start up much faster, at the small cost of some
flexibility (which could be regained with a command-line switch or
other mechanism to bypass modules.jar).
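
A bare-bones sketch of what that could look like (the modules.jar format assumed here, a single marshalled dict mapping module names to compiled code, is purely illustrative, and a real version would hook into the import machinery through Greg's imputils rather than being called by hand):

import imp
import marshal
import sys

class ArchiveImporter:
    # Loads modules out of one pre-built archive file instead of
    # probing the filesystem once per directory per suffix.
    def __init__(self, path):
        f = open(path, 'rb')
        self.index = marshal.load(f)   # {module name: marshalled code}
        f.close()

    def load(self, name):
        if name not in self.index:
            return None
        code = marshal.loads(self.index[name])
        module = imp.new_module(name)
        sys.modules[name] = module
        exec(code, module.__dict__)
        return module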

-Barry


guido at CNRI

Nov 16, 1999, 9:27 AM

Post #6 of 11 (417 views)
Re: shared data (was: Some thoughts on the codecs...) [In reply to]

> >>>>> "AMK" == Andrew M Kuchling <akuchlin [at] mems-exchange> writes:
>
> AMK> I don't see how this is fixable, unless we strip down
> AMK> site.py, which drags in os, which drags in os.path and stat
> AMK> and UserDict.
>
> One approach might be to support loading modules out of jar files (or
> whatever) using Greg imputils. We could put the bootstrap .pyc files
> in this jar and teach Python to import from it first. Python
> installations could even craft their own modules.jar file to include
> whatever modules they are willing to "hard code". This, with -S might
> make Python start up much faster, at the small cost of some
> flexibility (which could be regained with a c.l. switch or other
> mechanism to bypass modules.jar).

A completely different approach (which, incidentally, HP has lobbied
for before; and which has been implemented by Sjoerd Mullender for one
particular application) would be to cache a mapping from module names
to filenames in a dbm file. For Sjoerd's app (which imported hundreds
of modules) this made a huge difference. The problem is that it's
hard to deal with issues like updating the cache while sharing it with
other processes and even other users... But if those can be solved,
this could greatly reduce the number of stats and unsuccessful opens,
without having to resort to jar files.
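
A minimal sketch of such a cache (file names and layout are illustrative only, and it deliberately sidesteps the hard part of keeping a shared cache correct as modules move around):

import dbm
import imp
import os

# name -> filename cache, persisted across interpreter runs.
cache = dbm.open('/tmp/module-cache', 'c')

def find_module_cached(name, path=None):
    # Answer from the cache when the recorded file still exists;
    # otherwise fall back to the normal search and record the result.
    try:
        filename = cache[name]
        if os.path.exists(filename):
            return filename
    except KeyError:
        pass
    f, filename, descr = imp.find_module(name, path)
    if f:
        f.close()
    if filename:
        cache[name] = filename
    return filename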

--Guido van Rossum (home page: http://www.python.org/~guido/)


gmcm at hypernet

Nov 16, 1999, 9:56 AM

Post #7 of 11 (416 views)
Re: shared data (was: Some thoughts on the codecs...) [In reply to]

Barry A. Warsaw writes:

> One approach might be to support loading modules out of jar files
> (or whatever) using Greg imputils. We could put the bootstrap
> .pyc files in this jar and teach Python to import from it first.
> Python installations could even craft their own modules.jar file
> to include whatever modules they are willing to "hard code".
> This, with -S might make Python start up much faster, at the
> small cost of some flexibility (which could be regained with a
> c.l. switch or other mechanism to bypass modules.jar).

A couple hundred Windows users have been doing this for
months (http://starship.python.net/crew/gmcm/install.html).
The .pyz files are cross-platform, although the "embedding"
app would have to be redone for *nix (and all the embedding
really does is keep Python from hunting all over your disk).
Yeah, it's faster. And I can put Python+Tcl/Tk+IDLE on a
diskette with a little room left over.

but-since-its-WIndows-it-must-be-tainted-ly y'rs


- Gordon


gward at cnri

Nov 16, 1999, 10:54 AM

Post #8 of 11 (418 views)
Re: shared data (was: Some thoughts on the codecs...) [In reply to]

On 16 November 1999, Guido van Rossum said:
> A completely different approach (which, incidentally, HP has lobbied
> for before; and which has been implemented by Sjoerd Mullender for one
> particular application) would be to cache a mapping from module names
> to filenames in a dbm file. For Sjoerd's app (which imported hundreds
> of modules) this made a huge difference.

Hey, this could be a big win for Zope startup. Dunno how much of that
20-30 sec startup overhead is due to loading modules, but I'm sure it's
a sizeable percentage. Any Zope-heads listening?

> The problem is that it's
> hard to deal with issues like updating the cache while sharing it with
> other processes and even other users...

Probably not a concern in the case of Zope: one installation, one
process, only gets started when it's explicitly shut down and
restarted. HmmmMMMMmmm...

Greg


petrilli at amber

Nov 16, 1999, 11:04 AM

Post #9 of 11 (421 views)
Re: shared data (was: Some thoughts on the codecs...) [In reply to]

Greg Ward [gward [at] cnri] wrote:
> On 16 November 1999, Guido van Rossum said:
> > A completely different approach (which, incidentally, HP has lobbied
> > for before; and which has been implemented by Sjoerd Mullender for one
> > particular application) would be to cache a mapping from module names
> > to filenames in a dbm file. For Sjoerd's app (which imported hundreds
> > of modules) this made a huge difference.
>
> Hey, this could be a big win for Zope startup. Dunno how much of that
> 20-30 sec startup overhead is due to loading modules, but I'm sure it's
> a sizeable percentage. Any Zope-heads listening?

Wow, that's a huge startup time that I've personally never seen. I can't
imagine it... even loading the Oracle libraries dynamically, which are HUGE
(2Mb or so), it only takes a couple of seconds.

> > The problem is that it's
> > hard to deal with issues like updating the cache while sharing it with
> > other processes and even other users...
>
> Probably not a concern in the case of Zope: one installation, one
> process, only gets started when it's explicitly shut down and
> restarted. HmmmMMMMmmm...

This doesn't resolve things for a lot of other users of Python, however...
though Zope would always benefit, especially when you're running multiple
instances on the same machine... they would perhaps share more code.

Chris
--
| Christopher Petrilli
| petrilli [at] amber


gstein at lyra

Nov 16, 1999, 7:03 PM

Post #10 of 11 (419 views)
Re: shared data [In reply to]

On Tue, 16 Nov 1999, Gordon McMillan wrote:
> Barry A. Warsaw writes:
> > One approach might be to support loading modules out of jar files
> > (or whatever) using Greg imputils. We could put the bootstrap
> > .pyc files in this jar and teach Python to import from it first.
> > Python installations could even craft their own modules.jar file
> > to include whatever modules they are willing to "hard code".
> > This, with -S might make Python start up much faster, at the
> > small cost of some flexibility (which could be regained with a
> > c.l. switch or other mechanism to bypass modules.jar).
>
> Couple hundred Windows users have been doing this for
> months (http://starship.python.net/crew/gmcm/install.html).
> The .pyz files are cross-platform, although the "embedding"
> app would have to be redone for *nix, (and all the embedding
> really does is keep Python from hunting all over your disk).
> Yeah, it's faster. And I can put Python+Tcl/Tk+IDLE on a
> diskette with a little room left over.

I've got a patch from Jim Ahlstrom to provide a "standardized" library
file. I've got to review and fold that thing in (I'll post here when that
is done).

As Gordon states: yes, the startup time is considerably improved.

The DBM approach is interesting. That could definitely be used thru an
imputils Importer; it would be quite interesting to try that out.

(Note that updates would be even harder to deal with under the library-style
approach than with what Sjoerd saw using the DBM approach; I would guess
that the "right" approach is to rebuild the library from scratch and
atomically replace the thing (but that would bust people with open
references...))
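
A sketch of the rebuild-and-swap part (names illustrative; the rename is atomic within one filesystem, so new opens see either the old or the new archive, never a half-written one, while processes that already hold the old file open keep reading the old contents):

import marshal
import os

def rebuild_archive(index, path):
    # Write the replacement next to the live archive, then rename it
    # into place in a single step.
    tmp = path + '.new'
    f = open(tmp, 'wb')
    marshal.dump(index, f)
    f.close()
    os.rename(tmp, path)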

Certainly something to look at.

Cheers,
-g

p.s. I also want to try mmap'ing a library and creating code objects that
use PyBufferObjects (rather than PyStringObjects) that refer to portions
of the mmap. Presuming the mmap is shared, there "should" be a large
reduction in heap usage. The question is that I don't know what proportion of
the heap usage caused by loading a .pyc is actually code bytes.

p.p.s. I also want to try the buffer approach for frozen code.

--
Greg Stein, http://www.lyra.org/


tim_one at email

Nov 17, 1999, 2:10 AM

Post #11 of 11 (419 views)
RE: shared data (was: Some thoughts on the codecs...) [In reply to]

[Gordon McMillan]
> ...
> Yeah, it's faster. And I can put Python+Tcl/Tk+IDLE on a
> diskette with a little room left over.

That's truly remarkable (he says while waiting for the Inbox Repair Tool to
finish repairing his 50Mb Outlook mail file ...)!

> but-since-its-WIndows-it-must-be-tainted-ly y'rs

Indeed -- if it runs on Windows, it's a worthless piece o' crap <wink>.
