
Mailing List Archive: Python: Dev

Import redesign (was: Python 1.6 status)

 

 



guido at CNRI

Nov 18, 1999, 8:30 AM

Post #1 of 8
Import redesign (was: Python 1.6 status)

Gordon McMillan wrote:

> Marc-Andre wrote:
>
> > Fredrik Lundh wrote:
> >
> > > Guido van Rossum <guido [at] CNRI> wrote:
> > > > - suggestions for new issues that maybe ought to be settled in 1.6
> > >
> > > three things: imputil, imputil, imputil
> >
> > But please don't add the current version as default importer...
> > its strategy is way too slow for real life apps (yes, I've tested
> > this: imports typically take twice as long as with the builtin
> > importer).
>
> I think imputil's emulation of the builtin importer is more of a
> demonstration than a serious implementation. As for speed, it
> depends on the test.

Agreed. I like some of imputil's features, but I think the API
needs to be redesigned.

> > I'd opt for an import manager which provides a useful API for
> > import hooks to register themselves with.
>
> I think that rather than blindly chain themselves together, there
> should be a simple minded manager. This could let the
> programmer prioritize them.

Indeed. (A list of importers has been suggested, to replace the list
of directories currently used.)

> > What we really need
> > is not yet another complete reimplementation of what the
> > builtin importer does, but rather a more detailed exposure of
> > the various import aspects: finding modules and loading modules.
>
> The first clause I sort of agree with - the current
> implementation is a fine implementation of a filesystem
> directory based importer.
>
> I strongly disagree with the second clause. The current import
> hooks are just such a detailed exposure; and they are
> incomprehensible and unmanageable.

Based on how many people have successfully written import hooks, I
have to agree. :-(

> I guess you want to tweak the "finding" part of the builtin
> import mechanism. But that's no reason to ask all importers
> to break themselves up into "find" and "load" pieces. It's a
> reason to ask that the standard importer be, in some sense,
> "subclassable" (ie, expose hooks, or perhaps be an extension
> class like thingie).

Agreed. Subclassing is a good way towards flexibility.

And Jim Ahlstrom writes:

> IMHO the current import mechanism is good for developers who must
> work on the library code in the directory tree, but a disaster
> for sysadmins who must distribute Python applications either
> internally to a number of machines or commercially.

Unfortunately, you're right. :-(

> What we need is a standard Python library file like a Java "Jar"
> file. Imputil can support this as 130 lines of Python. I have also
> written one in C. I like the imputil approach, but if we want to
> add a library importer to import.c, I volunteer to write it.

Please volunteer to design or at least review the grand architecture
-- see below.

> I don't want to just add more complicated and unmanageable hooks
> which people will all use in different ways and just add to the
> confusion.

You're so right!

> It is easy to install packages by just making them into a library
> file and throwing it into a directory. So why aren't we doing it?

Rhetorical question. :-)

So here's a challenge: redesign the import API from scratch.

Let me start with some requirements.

Compatibility issues:
---------------------

- the core API may be incompatible, as long as compatibility layers
can be provided in pure Python

- support for rexec functionality

- support for freeze functionality

- load .py/.pyc/.pyo files and shared libraries from files

- support for packages

- sys.path and sys.modules should still exist; sys.path might
have a slightly different meaning

- $PYTHONPATH and $PYTHONHOME should still be supported

(I wouldn't mind a splitting up of importdl.c into several
platform-specific files, one of which is chosen by the configure
script; but that's a bit of a separate issue.)

New features:
-------------

- Integrated support for Greg Ward's distribution utilities (i.e. a
module prepared by the distutil tools should install painlessly)

- Good support for prospective authors of "all-in-one" packaging
tools like Gordon McMillan's win32 installer or /F's squish. (But
I *don't* require backwards compatibility for existing tools.)

- Standard import from zip or jar files, in two ways:

(1) an entry on sys.path can be a zip/jar file instead of a directory;
its contents will be searched for modules or packages

(2) a file in a directory that's on sys.path can be a zip/jar file;
its contents will be considered as a package (note that this is
different from (1)!)

I don't particularly care about supporting all zip compression
schemes; if Java gets away with only supporting gzip compression
in jar files, so can we.

- Easy ways to subclass or augment the import mechanism along
different dimensions. For example, while none of the following
features should be part of the core implementation, it should be
easy to add any or all:

- support for a new compression scheme to the zip importer

- support for a new archive format, e.g. tar

- a hook to import from URLs or other data sources (e.g. a
"module server" imported in CORBA) (this needn't be supported
through $PYTHONPATH though)

- a hook that imports from compressed .py or .pyc/.pyo files

- a hook to auto-generate .py files from other filename
extensions (as currently implemented by ILU)

- a cache for file locations in directories/archives, to improve
startup time

- a completely different source of imported modules, e.g. for an
embedded system or PalmOS (which has no traditional filesystem)

- Note that different kinds of hooks should (ideally, and within
reason) properly combine, as follows: if I write a hook to recognize
.spam files and automatically translate them into .py files, and you
write a hook to support a new archive format, then if both hooks are
installed together, it should be possible to find a .spam file in an
archive and do the right thing, without any extra action. Right?

- It should be possible to write hooks in C/C++ as well as Python

- Applications embedding Python may supply their own implementations,
default search path, etc., but don't have to if they want to piggyback
on an existing Python installation (even though the latter is
fraught with risk, it's cheaper and easier to understand).
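
The ".spam file inside an archive" case from the list above is easiest to get right when "where the bytes live" is kept separate from "how a file becomes Python source". A minimal sketch of that split, under stated assumptions: `DictStorage`, `spam_to_py`, `find_source` and `load_module` are hypothetical names invented for illustration, and the toy .spam syntax is made up.

```python
# Hypothetical sketch: storages fetch text by filename, translators turn
# a foreign extension into Python source.  Neither knows about the other.

class DictStorage:
    """Stands in for a directory or an archive: filename -> text."""
    def __init__(self, files):
        self.files = files

    def read(self, filename):
        return self.files.get(filename)  # None when the file is absent


def spam_to_py(text):
    """Toy translator: each line 'name value' becomes 'name = value'."""
    lines = [line.split(None, 1) for line in text.splitlines() if line.strip()]
    return "\n".join("%s = %s" % (name, value) for name, value in lines)


TRANSLATORS = {".py": lambda text: text, ".spam": spam_to_py}


def find_source(name, storages, translators=TRANSLATORS):
    """Try every storage with every known extension; return Python source."""
    for storage in storages:
        for ext, translate in translators.items():
            text = storage.read(name + ext)
            if text is not None:
                return translate(text)
    return None


def load_module(name, storages):
    """Execute the located source and return the resulting namespace."""
    source = find_source(name, storages)
    if source is None:
        raise ImportError(name)
    namespace = {"__name__": name}
    exec(compile(source, name, "exec"), namespace)
    return namespace


# The .spam translator and the "archive" storage were written independently,
# yet a .spam file inside the archive is found and translated.
archive = DictStorage({"config.spam": "answer 42\n"})
config = load_module("config", [DictStorage({}), archive])
print(config["answer"])
```

Adding a new archive format then means adding a storage, and adding a new file type means adding a translator; no pairwise glue is needed.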

Implementation:
---------------

- There must clearly be some code in C that can import certain
essential modules (to solve the chicken-or-egg problem), but I don't
mind if the majority of the implementation is written in Python.
Using Python makes it easy to subclass.

- In order to support importing from zip/jar files using compression,
we'd at least need the zlib extension module and hence libz itself,
which may not be available everywhere.

- I suppose that the bootstrap is solved using a mechanism very
similar to what freeze currently uses (other solutions seem to be
platform dependent).

- I also want to still support importing *everything* from the
filesystem, if only for development. (It's hard enough to deal with
the fact that exceptions.py is needed during Py_Initialize();
I want to be able to hack on the import code written in Python
without having to rebuild the executable all the time.)

Let's first complete the requirements gathering. Are these
requirements reasonable? Will they make an implementation too
complex? Am I missing anything?

Finally, to what extent does this impact the desire for dealing
differently with the Python bytecode compiler (e.g. supporting
optimizers written in Python)? And does it affect the desire to
implement the read-eval-print loop (the >>> prompt) in Python?

--Guido van Rossum (home page: http://www.python.org/~guido/)


jim at interet

Nov 18, 1999, 12:23 PM

Post #2 of 8
Re: Import redesign (was: Python 1.6 status)

Guido van Rossum wrote:
>
> Let's first complete the requirements gathering.

Yes.

> Are these
> requirements reasonable? Will they make an implementation too
> complex?

I think you can get 90% of where you want to be with something
much simpler. And the simpler implementation will be useful in
the 100% solution, so it is not wasted time.

How about if we just design a Python archive file format; provide
code in the core (in Python or C) to import from it; provide a
Python program to create archive files; and provide a Standard
Directory to put archives in so they can be found quickly. For
extensibility and control, we add functions to the imp module.
Detailed comments follow:


> Compatibility issues:
> ---------------------
> [list of current features...]

Easily met by keeping the current C code.

>
> New features:
> -------------
>
> - Integrated support for Greg Ward's distribution utilities (i.e. a
> module prepared by the distutil tools should install painlessly)
>
> - Good support for prospective authors of "all-in-one" packaging
> tools like Gordon McMillan's win32 installer or /F's squish. (But
> I *don't* require backwards compatibility for existing tools.)

These tools go well beyond just an archive file format, but hopefully
a file format will help. Greg and Gordon should be able to control the
format so it meets their needs. We need a standard format.

> - Standard import from zip or jar files, in two ways:
>
> (1) an entry on sys.path can be a zip/jar file instead of a directory;
> its contents will be searched for modules or packages
>
> (2) a file in a directory that's on sys.path can be a zip/jar file;
> its contents will be considered as a package (note that this is
> different from (1)!)

I don't like sys.path at all. It is currently part of the problem.
I suggest that archive files MUST be put into a known directory.
On Windows this is the directory of the executable, sys.executable.
On Unix this is $PREFIX plus version, namely
"%s/lib/python%s/" % (sys.prefix, sys.version[0:3]).
Other platforms can have different rules.
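
The rule above can be sketched as a small function. This is only an illustration of the proposal, not existing behaviour; `archive_dir` is a hypothetical name. One deviation is labelled in the comments: the version is taken from `sys.version_info` rather than the `sys.version[0:3]` slice, because the slice would turn a version such as "3.10" into "3.1".

```python
import os
import sys


def archive_dir():
    """Return the single directory in which archives would be looked up."""
    if sys.platform.startswith("win"):
        # Windows: the directory that holds the executable.
        return os.path.dirname(sys.executable)
    # Unix: $PREFIX plus the major.minor version.  The post slices
    # sys.version[0:3]; sys.version_info is used here instead (assumption
    # made for this sketch) so that two-digit minor versions still work.
    version = "%d.%d" % sys.version_info[:2]
    return "%s/lib/python%s/" % (sys.prefix, version)


print(archive_dir())
```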

We should also have the ability to append archive files to the
executable or a shared library assuming the OS allows this
(Windows and Linux do allow it). This is the first location
searched, nails the archive to the interpreter, insulates us
from an erroneous sys.path, and enables single-file Python programs.

> I don't particularly care about supporting all zip compression
> schemes; if Java gets away with only supporting gzip compression
> in jar files, so can we.

We don't need compression. The whole ./Lib is 1.2 Meg, and if we
compress it to zero we save a Meg. Irrelevant. Installers provide
compression anyway so when Python programs are shipped, they will be
compressed then.

Problems are that Python does not ship with compression, we will
have to add it, we will have to support it and its current method
of compression forever, and it adds complexity.

> - Easy ways to subclass or augment the import mechanism along
> different dimensions. For example, while none of the following
> features should be part of the core implementation, it should be
> easy to add any or all:
>
> [ List of new features including hooks...]

Sigh, this proposal does not provide for this. It seems
like a job for imputil. But if the file format and import code
is available from the imp module, it can be used as part of the
solution.

> - support for a new compression scheme to the zip importer

I guess compression should be easy to add if Python ships with
a compression module.

> - a cache for file locations in directories/archives, to improve
> startup time

If the Python library is available as an archive, I think
startup will be greatly improved anyway.

> Implementation:
> ---------------
>
> - There must clearly be some code in C that can import certain
> essential modules (to solve the chicken-or-egg problem), but I don't
> mind if the majority of the implementation is written in Python.
> Using Python makes it easy to subclass.

Yes.

> - In order to support importing from zip/jar files using compression,
> we'd at least need the zlib extension module and hence libz itself,
> which may not be available everywhere.

That's a good reason to omit compression. At least for now.

> - I suppose that the bootstrap is solved using a mechanism very
> similar to what freeze currently uses (other solutions seem to be
> platform dependent).

Yes, except that we need to be careful to preserve the freeze feature
for users. We don't want to take it over.

> - I also want to still support importing *everything* from the
> filesystem, if only for development. (It's hard enough to deal with
> the fact that exceptions.py is needed during Py_Initialize();
> I want to be able to hack on the import code written in Python
> without having to rebuild the executable all the time.)

Yes, we need a function in imp to turn archives off:
import imp
imp.archiveEnable(0)

> Finally, to what extent does this impact the desire for dealing
> differently with the Python bytecode compiler (e.g. supporting
> optimizers written in Python)? And does it affect the desire to
> implement the read-eval-print loop (the >>> prompt) in Python?

I don't think it impacts these at all.

Jim Ahlstrom


guido at CNRI

Nov 18, 1999, 12:55 PM

Post #3 of 8
Re: Import redesign (was: Python 1.6 status)

> I think you can get 90% of where you want to be with something
> much simpler. And the simpler implementation will be useful in
> the 100% solution, so it is not wasted time.

Agreed, but I'm not sure that it addresses the problems that started
this thread. I can't really tell, since the message starting the
thread just requested imputil, without saying which parts of it were
needed. A followup claimed that imputil was a fine prototype but too
slow for real work.

I inferred that flexibility was requested. But maybe that was
projection since that was on my own list. (I'm happy with the
performance and find manipulating zip or jar files clumsy, so I'm not
too concerned about all the nice things you can *do* with that
flexibility. :-)

> How about if we just design a Python archive file format; provide
> code in the core (in Python or C) to import from it; provide a
> Python program to create archive files; and provide a Standard
> Directory to put archives in so they can be found quickly. For
> extensibility and control, we add functions to the imp module.
> Detailed comments follow:

> These tools go well beyond just an archive file format, but hopefully
> a file format will help. Greg and Gordon should be able to control the
> format so it meets their needs. We need a standard format.

I think the standard format should be a subclass of zip or jar (which
is itself a subclass of zip). We have already written (at CNRI, as
yet unreleased) the necessary Python tools to manipulate zip archives;
moreover 3rd party tools are abundantly available, both on Unix and on
Windows (as well as in Java). Zip files also lend themselves to
self-extracting archives and similar things, because the file index is
at the end, so I think that Greg & Gordon should be happy.
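
The "file index is at the end" property is what makes appending an archive to another file workable. A minimal sketch using the `zipfile` module from later Python releases (not the unreleased CNRI tools mentioned above); the leading bytes are a stand-in for a real executable, and `spam.py` is an illustrative name.

```python
import io
import zipfile

# Build a small, uncompressed archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("spam.py", "x = 1\n")

# Pretend this is an executable with the archive appended to it.
fake_exe = b"\x7fELF...not a real binary..." + buf.getvalue()

# The reader locates the index by scanning back from the end of the file,
# so the leading bytes do not get in the way.
with zipfile.ZipFile(io.BytesIO(fake_exe)) as zf:
    names = zf.namelist()
    source = zf.read("spam.py").decode("ascii")

print(names, source)
```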

> I don't like sys.path at all. It is currently part of the problem.

Eh? That's the first thing I hear something bad about it. Maybe
that's because you live on Windows -- on Unix, search paths are
ubiquitous.

> I suggest that archive files MUST be put into a known directory.

Why? Maybe this works on Windows; on Unix this is asking for trouble
because it prevents users from augmenting the installation provided by
the sysadmin. Even on newer Windows versions, users without admin
perms may not be allowed to add files to that privileged directory.

> On Windows this is the directory of the executable, sys.executable.
> On Unix this is $PREFIX plus version, namely
> "%s/lib/python%s/" % (sys.prefix, sys.version[0:3]).
> Other platforms can have different rules.
>
> We should also have the ability to append archive files to the
> executable or a shared library assuming the OS allows this
> (Windows and Linux do allow it). This is the first location
> searched, nails the archive to the interpreter, insulates us
> from an erroneous sys.path, and enables single-file Python programs.

OK for the executable. I'm not sure what the point is of appending an
archive to the shared library? Anyway, does it matter (on Windows) if
you add it to python16.dll or to python.exe?

> We don't need compression. The whole ./Lib is 1.2 Meg, and if we
> compress it to zero we save a Meg. Irrelevant. Installers provide
> compression anyway so when Python programs are shipped, they will be
> compressed then.
>
> Problems are that Python does not ship with compression, we will
> have to add it, we will have to support it and its current method
> of compression forever, and it adds complexity.

OK, OK. I think most zip tools have a way to turn off the
compression. (Anyway, it's a matter of more I/O time vs. more CPU
time; hardware for both is getting better faster than we can tweak the
code :-)

> Sigh, this proposal does not provide for this. It seems
> like a job for imputil. But if the file format and import code
> is available from the imp module, it can be used as part of the
> solution.

Well, the question is really if we want flexibility or archive files.
I care more about the flexibility. If we get a clear vote for archive
files, I see no problem with implementing that first.

> If the Python library is available as an archive, I think
> startup will be greatly improved anyway.

Really? I know about all the system calls it makes, but I don't
really see much of a delay -- I have a prompt in well under 0.1
second.

--Guido van Rossum (home page: http://www.python.org/~guido/)


jim at interet

Nov 18, 1999, 3:40 PM

Post #4 of 8
Re: Import redesign (was: Python 1.6 status)

Guido van Rossum wrote:

> I think the standard format should be a subclass of zip or jar (which
> is itself a subclass of zip). We have already written (at CNRI, as
> yet unreleased) the necessary Python tools to manipulate zip archives;
> moreover 3rd party tools are abundantly available, both on Unix and on
> Windows (as well as in Java). Zip files also lend themselves to
> self-extracting archives and similar things, because the file index is
> at the end, so I think that Greg & Gordon should be happy.

Think about multiple packages in multiple zip files. The zip files
store file directories. That means we would need a sys.zippath to
search the zip files. I don't want another PYTHONPATH phenomenon.

Greg Stein and I once discussed this (and Gordon I think). They
argued that the directories should be flattened. That is, think of
all directories which can be reached on PYTHONPATH. Throw
away all initial paths. The resultant archive has *.pyc at the top
level, as well as package directories only. The search path is "." in
every archive file. No directory information is stored, only module
names, some with dots.
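
The flattening step can be sketched as a mapping from file paths to dotted module names. This is an illustration under stated assumptions: `flatten` is a hypothetical helper, paths are POSIX-style strings, and nested search-path entries are not treated specially.

```python
def flatten(search_path, files):
    """Map .pyc file paths to dotted names with the search-path prefix removed.

    search_path: list of directories, in PYTHONPATH order.
    files: list of .pyc paths found somewhere below those directories.
    """
    index = {}
    for root in search_path:  # earlier entries take precedence
        prefix = root.rstrip("/") + "/"
        for path in files:
            if path.startswith(prefix) and path.endswith(".pyc"):
                dotted = path[len(prefix):-len(".pyc")].replace("/", ".")
                # Keep the first hit, as a normal path search would.
                index.setdefault(dotted, path)
    return index


files = [
    "/usr/lib/python1.5/string.pyc",
    "/usr/lib/python1.5/xml/dom/core.pyc",
    "/home/me/lib/string.pyc",
]
index = flatten(["/home/me/lib", "/usr/lib/python1.5"], files)
for name in sorted(index):
    print(name, "<-", index[name])
```

Once the index is built, the archive needs no search path of its own: a lookup is a single dictionary access on the dotted name.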

> > I don't like sys.path at all. It is currently part of the problem.
>
> Eh? That's the first thing I hear something bad about it. Maybe
> that's because you live on Windows -- on Unix, search paths are
> ubiquitous.

On Windows, just print sys.path. It is junk. A commercial
distribution has to "just work", and it fails if a second installation
(by someone else) changes PYTHONPATH to suit their app. I am trying
to get to "just works", no excuses, no complications.

> > I suggest that archive files MUST be put into a known directory.
>
> Why? Maybe this works on Windows; on Unix this is asking for trouble
> because it prevents users from augmenting the installation provided by
> the sysadmin. Even on newer Windows versions, users without admin
> perms may not be allowed to add files to that privileged directory.

It works on Windows because programs install themselves in their own
subdirectories, and can put files there instead of /windows/system32.
This holds true for Windows 2000 also. A Unix-style installation
to /windows/system32 would (may?) require "administrator" privilege.

On Unix you are right. I didn't think of that because I am the Unix
sysadmin here, so I can put things where I want. The Windows
solution doesn't fit with Unix, because executables go in a ./bin
directory and putting library files there is a no-no. Hmmmm...
This needs more thought. Anyone else have ideas??

> > We should also have the ability to append archive files to the
> > executable or a shared library assuming the OS allows this
>
> OK for the executable. I'm not sure what the point is of appending an
> archive to the shared library? Anyway, does it matter (on Windows) if
> you add it to python16.dll or to python.exe?

The point of using python16.dll is to append the Python library to
it, and append to python.exe (or use files) for everything else.
That way, the 1.6 interpreter is linked to the 1.6 Lib, upgrading
to 1.7 means replacing only one file, and there is no wasted storage
in multiple Lib's. I am thinking of multiple Python programs in
different directories.

But maybe you are right. On Windows, if python.exe can be put in
/windows/system32 then it really doesn't matter.

> OK, OK. I think most zip tools have a way to turn off the
> compression. (Anyway, it's a matter of more I/O time vs. more CPU
> time; hardware for both is getting better faster than we can tweak the
> code :-)

Well, if Python now has its own compression that is built
in and comes with it, then that is different. Maybe compression
is OK.

> Well, the question is really if we want flexibility or archive files.
> I care more about the flexibility. If we get a clear vote for archive
> files, I see no problem with implementing that first.

I don't like flexibility, I like standardization and simplicity.
Flexibility just encourages users to do the wrong thing.

Everyone vote please. I don't have a solid feeling about
what people want, only what they don't like.

> > If the Python library is available as an archive, I think
> > startup will be greatly improved anyway.
>
> Really? I know about all the system calls it makes, but I don't
> really see much of a delay -- I have a prompt in well under 0.1
> second.

So do I. I guess I was just echoing someone else's complaint.

JimA


gmcm at hypernet

Nov 18, 1999, 7:23 PM

Post #5 of 8
Re: Import redesign (was: Python 1.6 status)

[Guido]
> > I think the standard format should be a subclass of zip or jar
> > (which is itself a subclass of zip). We have already written
> > (at CNRI, as yet unreleased) the necessary Python tools to
> > manipulate zip archives; moreover 3rd party tools are
> > abundantly available, both on Unix and on Windows (as well as
> > in Java). Zip files also lend themselves to self-extracting
> > archives and similar things, because the file index is at the
> > end, so I think that Greg & Gordon should be happy.

No problem (I created my own formats for relatively minor
reasons).

[JimA]
> Think about multiple packages in multiple zip files. The zip
> files store file directories. That means we would need a
> sys.zippath to search the zip files. I don't want another
> PYTHONPATH phenomenon.

What if sys.path looked like:
[DirImporter('.'), ZlibImporter('c:/python/stdlib.pyz'), ...]
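
A rough sketch of what a path made of importer objects might look like. Only the name `DirImporter` comes from the line above; the `get_source` method, the `DictImporter` stand-in for an archive importer, and `import_from` are assumptions made for illustration.

```python
import os
import tempfile


class DirImporter:
    """Hypothetical importer that looks for <name>.py in one directory."""
    def __init__(self, directory):
        self.directory = directory

    def get_source(self, name):
        path = os.path.join(self.directory, name + ".py")
        if os.path.isfile(path):
            with open(path) as f:
                return f.read()
        return None


class DictImporter:
    """Stand-in for an archive importer: module name -> source text."""
    def __init__(self, sources):
        self.sources = sources

    def get_source(self, name):
        return self.sources.get(name)


def import_from(importers, name):
    """Ask each importer in list order; the first one that answers wins."""
    for importer in importers:
        source = importer.get_source(name)
        if source is not None:
            namespace = {"__name__": name}
            exec(compile(source, name, "exec"), namespace)
            return namespace
    raise ImportError("no importer could find %r" % name)


# An empty directory first, then the "archive"; list order is the priority.
path = [DirImporter(tempfile.mkdtemp()), DictImporter({"eggs": "count = 12\n"})]
eggs = import_from(path, "eggs")
print(eggs["count"])
```

Because the list holds objects rather than strings, no second search path is needed for archives, and reordering the list is all it takes to change priority.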

> Greg Stein and I once discussed this (and Gordon I think). They
> argued that the directories should be flattened. That is, think
> of all directories which can be reached on PYTHONPATH. Throw
> away all initial paths. The resultant archive has *.pyc at the
> top level, as well as package directories only. The search path
> is "." in every archive file. No directory information is
> stored, only module names, some with dots.

While I do flat archives (no dots, but that's a different story),
there's no reason the archive couldn't be structured. Flat
archives are definitely simpler.

[JimA]
> > > I don't like sys.path at all. It is currently part of the
> > > problem.
[Guido]
> > Eh? That's the first thing I hear something bad about it.
> > Maybe that's because you live on Windows -- on Unix, search
> > paths are ubiquitous.
>
> On Windows, just print sys.path. It is junk. A commercial
> distribution has to "just work", and it fails if a second
> installation (by someone else) changes PYTHONPATH to suit their
> app. I am trying to get to "just works", no excuses, no
> complications.

Py_Initialize ();
PyRun_SimpleString ("import sys; del sys.path[1:]");

Yeah, there's a hole there. Fixable if you could do a little pre-
Py_Initialize twiddling.

> > > I suggest that archive files MUST be put into a known
> > > directory.

No way. Hard code a directory? Overwrite someone else's
Python "standalone"? Write to a C: partition that is
deliberately sized to hold nothing but Windows? Make
network installations impossible?

> > Why? Maybe this works on Windows; on Unix this is asking for
> > trouble because it prevents users from augmenting the
> > installation provided by the sysadmin. Even on newer Windows
> > versions, users without admin perms may not be allowed to add
> > files to that privileged directory.
>
> It works on Windows because programs install themselves in their
> own subdirectories, and can put files there instead of
> /windows/system32. This holds true for Windows 2000 also. A
> Unix-style installation to /windows/system32 would (may?) require
> "administrator" privilege.

There's nothing Unix-style about installing to
/Windows/system32. 'Course *they* have symbolic links that
actually work...

> On Unix you are right. I didn't think of that because I am the
> Unix sysadmin here, so I can put things where I want. The
> Windows solution doesn't fit with Unix, because executables go in
> a ./bin directory and putting library files there is a no-no.
> Hmmmm... This needs more thought. Anyone else have ideas??

The official Windows solution is stuff in registry about app
paths and such. Putting the dlls in the exe's directory is a
workaround which works and is more manageable than the
official solution.

> > > We should also have the ability to append archive files to
> > > the executable or a shared library assuming the OS allows
> > > this

That's a handy trick on Windows, but it's got nothing to do
with Python.

> > Well, the question is really if we want flexibility or archive
> > files. I care more about the flexibility. If we get a clear
> > vote for archive files, I see no problem with implementing that
> > first.
>
> I don't like flexibility, I like standardization and simplicity.
> Flexibility just encourages users to do the wrong thing.

I've noticed that the people who think there should only be one
way to do things never agree on what it is.

> Everyone vote please. I don't have a solid feeling about
> what people want, only what they don't like.

Flexibility. You can put Christian's favorite Einstein quote here
too.

> > > If the Python library is available as an archive, I think
> > > startup will be greatly improved anyway.
> >
> > Really? I know about all the system calls it makes, but I
> > don't really see much of a delay -- I have a prompt in well
> > under 0.1 second.
>
> So do I. I guess I was just echoing someone else's complaint.

Install some stuff. Deinstall some of it. Repeat (mixing up the
order) until your registry and hard drive are shattered into tiny
little fragments. It doesn't take long (there's lots of stuff a
defragmenter can't touch once it's there).


- Gordon


mal at lemburg

Nov 19, 1999, 2:22 AM

Post #6 of 8
Re: Import redesign (was: Python 1.6 status)

Guido van Rossum wrote:
>
> Let's first complete the requirements gathering. Are these
> requirements reasonable? Will they make an implementation too
> complex? Am I missing anything?

Since you were asking: I would like functionality equivalent
to my latest import patch for a slightly different lookup scheme
for module import inside packages to become a core feature.

If it becomes a core feature I promise to never again start
threads about relative imports :-)

Here's the summary again:
"""
[The patch] changes the default import mechanism to work like this:

>>> import d # from directory a/b/c/
try a.b.c.d
try a.b.d
try a.d
try d
fail

instead of just doing the current two-level lookup:

>>> import d # from directory a/b/c/
try a.b.c.d
try d
fail

As a result, relative imports referring to higher level packages
work out of the box without any ugly underscores in the import name.
Plus the whole scheme is pretty simple to explain and straightforward.
"""

You can find the patch attached to the message "Walking up the package
hierarchy" in the python-dev mailing list archive.
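
The two lookup orders in the summary above can be sketched as name generators. This is not the patch itself, only an illustration of the order in which absolute names would be tried; `candidate_names` and `current_candidates` are hypothetical helper names.

```python
def candidate_names(package, module):
    """Yield the names tried for `import module` inside `package`,
    walking up the package hierarchy one level at a time."""
    parts = package.split(".") if package else []
    while parts:
        yield ".".join(parts + [module])
        parts.pop()  # drop the innermost package and try again
    yield module     # finally, the top-level name


def current_candidates(package, module):
    """The existing two-level lookup, for comparison."""
    names = []
    if package:
        names.append(package + "." + module)
    names.append(module)
    return names


walked = list(candidate_names("a.b.c", "d"))
two_level = current_candidates("a.b.c", "d")
print(walked)
print(two_level)
```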

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 42 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/


gmcm at hypernet

Nov 19, 1999, 11:22 AM

Post #7 of 8
Re: Import redesign (was: Python 1.6 status)

[Guido]
> Compatibility issues:
> ---------------------
>
> - the core API may be incompatible, as long as compatibility
> layers can be provided in pure Python

Good idea. Question: we have keyword import, __import__,
imp and PyImport_*. Which of those (if any) define the "core
API"?

[rexec, freeze: yes]

> - load .py/.pyc/.pyo files and shared libraries from files

Shared libraries? Might that not involve some rather shady
platform-specific magic? If it can be kept kosher, I'm all for it;
but I'd say no if it involved, um, undocumented features.

> - support for packages

Absolutely. I'll just comment that the concept of
package.__path__ is also affected by the next point.
>
> - sys.path and sys.modules should still exist; sys.path might
> have a slightly different meaning
>
> - $PYTHONPATH and $PYTHONHOME should still be supported

If sys.path changes meaning, should not $PYTHONPATH
also?

> New features:
> -------------
>
> - Integrated support for Greg Ward's distribution utilities (i.e.
> a module prepared by the distutil tools should install
> painlessly)

I assume that this is mostly a matter of $PYTHONPATH and
other path manipulation mechanisms?

> - Good support for prospective authors of "all-in-one" packaging
> tools like Gordon McMillan's win32 installer or /F's squish.
> (But I *don't* require backwards compatibility for existing
> tools.)

I guess you've forgotten: I'm that *really* tall guy <wink>.

> - Standard import from zip or jar files, in two ways:
>
> (1) an entry on sys.path can be a zip/jar file instead of a
> directory; its contents will be searched for modules or
> packages

I don't mind this, but it depends on whether sys.path changes
meaning.
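
For reference, later Python releases ended up supporting variant (1) through the standard zipimport machinery, with sys.path entries remaining plain strings. A minimal sketch, assuming a modern Python 3; the archive and module names (`bundle.zip`, `hello_from_zip`) are made up for illustration.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway, uncompressed archive holding one module.
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "bundle.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("hello_from_zip.py", "GREETING = 'imported from a zip'\n")

# Variant (1): the archive itself is an entry on sys.path.
sys.path.insert(0, archive)
importlib.invalidate_caches()
mod = importlib.import_module("hello_from_zip")
print(mod.GREETING)
```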

> (2) a file in a directory that's on sys.path can be a zip/jar
> file; its contents will be considered as a package (note that
> this is different from (1)!)

But it's affected by the same considerations (eg, do we start
with filesystem names and wrap them in importers, or do we
just start with importer instances / specifications for importer
instances).

> I don't particularly care about supporting all zip compression
> schemes; if Java gets away with only supporting gzip
> compression in jar files, so can we.

I think this is a matter of what zip compression is officially
blessed. I don't mind if it's none; providing / creating zipped
versions for platforms that support it is nearly trivial.

> - Easy ways to subclass or augment the import mechanism along
> different dimensions. For example, while none of the following
> features should be part of the core implementation, it should
> be easy to add any or all:
>
> - support for a new compression scheme to the zip importer
>
> - support for a new archive format, e.g. tar
>
> - a hook to import from URLs or other data sources (e.g. a
> "module server" imported in CORBA) (this needn't be supported
> through $PYTHONPATH though)

Which begs the question of the meaning of sys.path; and if it's
still filesystem names, how do you get one of these in there?

> - a hook that imports from compressed .py or .pyc/.pyo files
>
> - a hook to auto-generate .py files from other filename
> extensions (as currently implemented by ILU)
>
> - a cache for file locations in directories/archives, to improve
> startup time
>
> - a completely different source of imported modules, e.g. for an
> embedded system or PalmOS (which has no traditional filesystem)
>
> - Note that different kinds of hooks should (ideally, and within
> reason) properly combine, as follows: if I write a hook to
> recognize .spam files and automatically translate them into .py
> files, and you write a hook to support a new archive format,
> then if both hooks are installed together, it should be
> possible to find a .spam file in an archive and do the right
> thing, without any extra action. Right?

A bit of discussion: I've got 2 kinds of archives. One can
contain anything & is much like a zip (and probably should be
a zip). The other contains only compressed .pyc or .pyo. The
latter keys contents by logical name, not filesystem name. There
are no extensions, and when a package is imported, the code object
returned is the __init__ code object (vs. returning None and
letting the import mechanism come back and ask for
package.__init__).

When you're building an archive, you have to go through the .py /
.pyc / .pyo / is-it-a-package / maybe-compile logic anyway.
Why not get it all over with, so that at runtime there are no
choices to be made?

Which means (for this kind of archive) that including
somebody's .spam in your archive isn't a matter of a hook, but
a matter of adding to the archive's build smarts.
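
A minimal sketch of that second kind of archive, under stated assumptions: the archive is modelled as a plain dict, and build_archive and load_module are hypothetical names, not the format or API of any existing installer. All compile decisions happen at build time; at run time a logical name maps straight to a compressed, marshalled code object.

```python
import marshal
import types
import zlib


def build_archive(sources):
    """Build time: compile everything and key it by logical module name."""
    archive = {}
    for name, text in sources.items():
        code = compile(text, "<archive:%s>" % name, "exec")
        archive[name] = zlib.compress(marshal.dumps(code))
    return archive


def load_module(archive, name):
    """Run time: no extensions or file types to choose between."""
    code = marshal.loads(zlib.decompress(archive[name]))
    module = types.ModuleType(name)
    exec(code, module.__dict__)
    return module


archive = build_archive({
    # For a package, the stored code object is that of its __init__.
    "pkg": "VERSION = '1.0'\n",
    "pkg.util": "def double(x):\n    return 2 * x\n",
})

pkg = load_module(archive, "pkg")
util = load_module(archive, "pkg.util")
print(pkg.VERSION, util.double(21))
```

Supporting a .spam source in this scheme would mean teaching build_archive to translate it before compiling, with nothing changing in load_module.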

> - It should be possible to write hooks in C/C++ as well as Python
>
> - Applications embedding Python may supply their own implementations,
> default search path, etc., but don't have to if they want to
> piggyback on an existing Python installation (even though the
> latter is fraught with risk, it's cheaper and easier to
> understand).

A way of tweaking that which will become sys.path before
Py_Initialize would be *most* welcome.

> Implementation:
> ---------------
>
> - There must clearly be some code in C that can import certain
> essential modules (to solve the chicken-or-egg problem), but I
> don't mind if the majority of the implementation is written in
> Python. Using Python makes it easy to subclass.
>
> - In order to support importing from zip/jar files using compression,
> we'd at least need the zlib extension module and hence libz
> itself, which may not be available everywhere.
>
> - I suppose that the bootstrap is solved using a mechanism very
> similar to what freeze currently uses (other solutions seem to
> be platform dependent).

There are other possibilities here, but I have only
half-formulated ideas at the moment. The critical part for
embedding is to be able to *completely* control all
path-related logic.

> - I also want to still support importing *everything* from the
> filesystem, if only for development. (It's hard enough to deal
> with the fact that exceptions.py is needed during
> Py_Initialize(); I want to be able to hack on the import code
> written in Python without having to rebuild the executable all
> the time.)
>
> Let's first complete the requirements gathering. Are these
> requirements reasonable? Will they make an implementation too
> complex? Am I missing anything?

I'll summarize as follows:
1) What "sys.path" means (and how its construction can be
manipulated) is critical.
2) See 1.

> Finally, to what extent does this impact the desire for dealing
> differently with the Python bytecode compiler (e.g. supporting
> optimizers written in Python)? And does it affect the desire to
> implement the read-eval-print loop (the >>> prompt) in Python?

I can assure you that code.py runs fine out of an archive :-).

- Gordon


jim at interet

Nov 22, 1999, 10:25 AM

Post #8 of 8 (326 views)
Permalink
Re: Import redesign (was: Python 1.6 status) [In reply to]

Gordon McMillan wrote:

> [JimA]
> > Think about multiple packages in multiple zip files. The zip
> > files store file directories. That means we would need a
> > sys.zippath to search the zip files. I don't want another
> > PYTHONPATH phenomenon.
>
> What if sys.path looked like:
> [DirImporter('.'), ZlibImporter('c:/python/stdlib.pyz'), ...]

Well, that changes the current meaning of sys.path.
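
To make the change in meaning concrete, here is a minimal sketch of a search path made of importer objects rather than directory names. Everything in it is an assumption for illustration: DirImporter only borrows its name from the quoted example, DictImporter stands in for an archive importer such as ZlibImporter, and the get_source method and import_from function are invented here rather than taken from imputil or any other real API.

```python
import os
import tempfile
import types


class DirImporter:
    """Hypothetical importer that looks for NAME.py in one directory."""

    def __init__(self, directory):
        self.directory = directory

    def get_source(self, name):
        path = os.path.join(self.directory, name + ".py")
        if os.path.isfile(path):
            with open(path) as f:
                return f.read()
        return None


class DictImporter:
    """Hypothetical stand-in for an archive importer."""

    def __init__(self, sources):
        self.sources = sources

    def get_source(self, name):
        return self.sources.get(name)


def import_from(importers, name):
    """Ask each importer in order; the first one that answers wins."""
    for importer in importers:
        source = importer.get_source(name)
        if source is not None:
            module = types.ModuleType(name)
            exec(compile(source, name, "exec"), module.__dict__)
            return module
    raise ImportError("no importer could find %r" % name)


demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "local_mod.py"), "w") as f:
    f.write("WHERE = 'directory'\n")

search_path = [
    DirImporter(demo_dir),
    DictImporter({
        "local_mod": "WHERE = 'archive'\n",   # shadowed by the directory
        "packed_mod": "WHERE = 'archive'\n",
    }),
]

print(import_from(search_path, "local_mod").WHERE)
print(import_from(search_path, "packed_mod").WHERE)
```

The order of the list is the priority order, so code that treats each entry as a filesystem string (for example, joining it with a file name) would no longer work on such a path.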

> > > > I suggest that archive files MUST be put into a known
> > > > directory.
>
> No way. Hard code a directory? Overwrite someone else's
> Python "standalone"? Write to a C: partition that is
> deliberately sized to hold nothing but Windows? Make
> network installations impossible?

Ooops. I didn't mean a known directory you couldn't change.
But I did mean a directory you shouldn't change.

But you are right. The directory should be configurable. But
I would still like to see a highly encouraged directory. I
don't yet have a good design for this. Anyone have ideas on an
official way to find library files?

I think a Python library file is a Good Thing, but it is not useful if
the archive can't be found.

I am thinking of a busy SysAdmin with someone nagging him/her to
install Python. SysAdmin doesn't want another headache. What if
Python becomes popular and users want it on Unix and PCs? More
work! There should be a standard way to do this that just works
and is dumb-stupid-simple. This is a Python promotion issue. Yes,
everyone here can make sys.path work, but that is not the point.

> The official Windows solution is stuff in registry about app
> paths and such. Putting the dlls in the exe's directory is a
> workaround which works and is more manageable than the
> official solution.

I agree completely.

> > > > We should also have the ability to append archive files to
> > > > the executable or a shared library assuming the OS allows
> > > > this
>
> That's a handy trick on Windows, but it's got nothing to do
> with Python.

It also works on Linux. I don't know about other systems.

> Flexibility. You can put Christian's favorite Einstein quote here
> too.

I hope we can still have ease of use with all this flexibility.
As I said, we need to promote Python.

Jim Ahlstrom
