
Mailing List Archive: Python: Bugs

[issue14657] Avoid two importlib copies

 

 



report at bugs

Apr 24, 2012, 2:36 PM

Post #26 of 73 (414 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Brett Cannon <brett [at] python> added the comment:

That's why I was thinking of tying into Modules/getpath.c because I assume that would work cross-platform. Is that incorrect?

----------

_______________________________________
Python tracker <report [at] bugs>
<http://bugs.python.org/issue14657>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com


report at bugs

Apr 24, 2012, 2:37 PM

Post #27 of 73 (393 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Antoine Pitrou <pitrou [at] free> added the comment:

> That's why I was thinking of tying into Modules/getpath.c because I
> assume that would work cross-platform. Is that incorrect?

Windows uses PC/getpathp.c, not Modules/getpath.c (with tons of
duplicate code)... So you would have to tie into both :)

----------



report at bugs

Apr 24, 2012, 2:46 PM

Post #28 of 73 (400 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Marc-Andre Lemburg <mal [at] egenix> added the comment:

Brett Cannon wrote:
>
> Modules/getpath.c seems to be where the C code does it when getting paths for sys.path. So it would be possible to use that same algorithm to set some sys attribute (e.g. in_checkout or something) much like sys.gettotalrefcount is optional and only shown when built with --with-pydebug. Otherwise some directory structure check could be done (e.g. find importlib/_bootstrap.py off of sys.path, and then see if ../Modules/Setup or something also exists that would never show up in an installed CPython).

Why not simply use a flag that gets set based on an environment
variable, say PYTHONDEVMODE ?

Adding more cruft to getpath.c or similar routines is just going to
slow down startup time even more...

Python 2.7 has a startup time of 70ms on my machine; compare that to
Python 2.1 with 10ms and Perl 5 with just 4ms.
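
For anyone wanting to reproduce this kind of number, here is a rough sketch. It is not how the figures above were obtained; the helper name `startup_time_ms` is made up for illustration, and the result includes process-spawn overhead, so treat it as an upper bound.

```python
import subprocess
import sys
import time


def startup_time_ms(executable=sys.executable, runs=5):
    """Return the best wall-clock time, in ms, to start and exit an interpreter."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # "-c pass" starts the interpreter, runs an empty statement, and exits.
        subprocess.run([executable, "-c", "pass"], check=True)
        best = min(best, time.perf_counter() - start)
    return best * 1000.0


if __name__ == "__main__":
    print(f"{startup_time_ms():.1f} ms")
```

Taking the minimum over several runs reduces noise from disk caches and other processes; pass a different `executable` path to compare interpreters.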

----------



report at bugs

Apr 24, 2012, 2:52 PM

Post #29 of 73 (395 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Antoine Pitrou <pitrou [at] free> added the comment:

> Adding more cruft to getpath.c or similar routines is just going to
> slow down startup time even more...

The code is already there.

----------



report at bugs

Apr 24, 2012, 3:21 PM

Post #30 of 73 (395 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Marc-Andre Lemburg <mal [at] egenix> added the comment:

Antoine Pitrou wrote:
>
>> Adding more cruft to getpath.c or similar routines is just going to
>> slow down startup time even more...
>
> The code is already there.

Code to detect whether you're running off a checkout vs. a normal
installation by looking at even more directories ? I don't
see any in getpath.c (and that's good).

----------



report at bugs

Apr 24, 2012, 3:29 PM

Post #31 of 73 (396 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Antoine Pitrou <pitrou [at] free> added the comment:

> Code to detect whether you're running off a checkout vs. a normal
> installation by looking at even more directories ? I don't
> see any in getpath.c (and that's good).

Look for "pybuilddir.txt".
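
As a rough Python approximation of that check (the real logic lives in C in getpath.c; the function name `find_pybuilddir` and the choice to look only next to the executable are assumptions made for this sketch):

```python
import os
import sys


def find_pybuilddir(executable=sys.executable):
    """Return the contents of pybuilddir.txt next to the executable, or None."""
    exe_dir = os.path.dirname(os.path.realpath(executable))
    marker = os.path.join(exe_dir, "pybuilddir.txt")
    try:
        with open(marker, encoding="utf-8") as f:
            # The file holds a relative path to the built extension modules.
            return f.read().strip()
    except OSError:
        # No marker file: most likely an installed interpreter.
        return None


if __name__ == "__main__":
    print(find_pybuilddir())
```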

----------



report at bugs

Apr 24, 2012, 6:13 PM

Post #32 of 73 (395 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Brett Cannon <brett [at] python> added the comment:

That solves the "I'm in a checkout" problem but it doesn't tell you necessarily where the Lib directory is if you e.g. build from within another directory like Python/, which places the executable and pybuilddir.txt in the current directory.

Now obviously you could argue that supporting that case is not worth it for development's sake, but I figured since I knew it existed I should put it out there.

----------



report at bugs

Apr 25, 2012, 2:07 AM

Post #33 of 73 (394 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Marc-Andre Lemburg <mal [at] egenix> added the comment:

Antoine Pitrou wrote:
>
> Antoine Pitrou <pitrou [at] free> added the comment:
>
>> Code to detect whether you're running off a checkout vs. a normal
>> installation by looking at even more directories ? I don't
>> see any in getpath.c (and that's good).
>
> Look for "pybuilddir.txt".

Oh dear. Another one of those hacks... why wasn't this done using
constants passed in by the configure script and simple string
comparison ?

BTW: The startup time of python3.3 is 113ms on my machine, that's
more than twice as long as python2.7. Given the history, it
looks like no one cares about these things anymore... :-(

----------



report at bugs

Apr 25, 2012, 4:49 AM

Post #34 of 73 (397 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Antoine Pitrou <pitrou [at] free> added the comment:

> > Look for "pybuilddir.txt".
>
> Oh dear. Another one of those hacks... why wasn't this done using
> constants passed in by the configure script and simple string
> comparison ?

How would that help distinguish between an installed Python and a
non-installed Python? If you have an idea about that, please open an
issue and explain it precisely :)

----------



report at bugs

Apr 25, 2012, 5:06 AM

Post #35 of 73 (411 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Marc-Andre Lemburg <mal [at] egenix> added the comment:

Antoine Pitrou wrote:
>
> Antoine Pitrou <pitrou [at] free> added the comment:
>
>>> Look for "pybuilddir.txt".
>>
>> Oh dear. Another one of those hacks... why wasn't this done using
>> constants passed in by the configure script and simple string
>> comparison ?
>
> How would that help distinguish between an installed Python and a
> non-installed Python? If you have an idea about that, please open an
> issue and explain it precisely :)

The question pybuilddir.txt apparently tries to solve is whether Python
is running from the build dir or not. It's not whether Python was
installed or not. Checking for the build dir can be done by looking
at the argv[0] of the executable and comparing that to the build dir.
This can be compiled into the interpreter using a constant, say
BUILDDIR. At runtime, you'd simply compare the current argv[0] to
the BUILDDIR. If it matches, you know that you can assume the
build dir layout with reasonable certainty and proceed accordingly.
No need for extra joins, file reads, etc.
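
A minimal sketch of that comparison, written in Python for readability although the proposal is about C startup code. The `BUILDDIR` value and the function name `running_from_builddir` are hypothetical; in the proposal the constant would be supplied by the configure script.

```python
import os

# Hypothetical value; the proposal has configure bake this in at build time.
BUILDDIR = "/home/user/cpython"


def running_from_builddir(argv0, builddir=BUILDDIR):
    """Compare the executable's directory to BUILDDIR using string operations only."""
    exe_dir = os.path.dirname(os.path.abspath(argv0))
    # normcase makes the comparison case-insensitive on Windows.
    return os.path.normcase(exe_dir) == os.path.normcase(os.path.abspath(builddir))


if __name__ == "__main__":
    print(running_from_builddir("/home/user/cpython/python"))
    print(running_from_builddir("/usr/bin/python3"))
```

The check never stats or opens a file, which is the point of the suggestion. It would give a wrong answer if the build directory were moved after compilation.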

But given the enormous startup time of Python 3.3, those few stats
won't make a difference anyway. This would need a completely different
holistic approach. Perhaps something for a SoC project.

----------



report at bugs

Apr 25, 2012, 5:12 AM

Post #36 of 73 (397 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Antoine Pitrou <pitrou [at] free> added the comment:

> The question pybuilddir.txt apparently tries to solve is whether Python
> is running from the build dir or not. It's not whether Python was
> installed or not.

That's the same, for all we're concerned.
But pybuilddir.txt does not only solve that problem. It also contains
the path to extension modules generated by setup.py, so that sys.path
can be set up appropriately at startup.

----------



report at bugs

Apr 25, 2012, 5:28 AM

Post #37 of 73 (396 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Marc-Andre Lemburg <mal [at] egenix> added the comment:

Antoine Pitrou wrote:
>
> Antoine Pitrou <pitrou [at] free> added the comment:
>
>> The question pybuilddir.txt apparently tries to solve is whether Python
>> is running from the build dir or not. It's not whether Python was
>> installed or not.
>
> That's the same, for all we're concerned.
> But pybuilddir.txt does not only solve that problem. It also contains
> the path to extension modules generated by setup.py, so that sys.path
> can be set up appropriately at startup.

Would be easier to tell distutils to install the extensions
in a fixed name dir (instead of using a platform and version
in the name) and then use that in getpath.c. distutils is pretty
flexible at that :-)

----------



report at bugs

Apr 25, 2012, 5:32 AM

Post #38 of 73 (396 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Nick Coghlan <ncoghlan [at] gmail> added the comment:

Still no patch from me, but I did create the rudiments of a shared script for poking around at the import internals (Tools/scripts/import_diagnostics.py).

Looking at Antoine's patch, I'd be happier with it if it *didn't* mutate the attributes of _frozen_importlib, but instead just added importlib._bootstrap as an alias for accessing it.

That would bring it in line with the way we handle os.path as being just an alias for the appropriate top level module:

>>> import os.path
>>> os.path.__name__
'posixpath'

Getting access to the source level _bootstrap implementation for testing purposes would then just require the usual techniques for bypassing C accelerators (specifically, using test.support.import_fresh_module with "_frozen_importlib" blocked).

That would address the immediate problem of module duplication, without misrepresenting what is going on in potentially confusing ways.
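
The aliasing idea can be sketched with throwaway modules. All names here (`_frozen_demo`, `demo_pkg`) are hypothetical stand-ins; this is not Antoine's patch nor the real importlib code, only an illustration of one module object being reachable under two names without its attributes being mutated.

```python
import sys
import types

# Hypothetical stand-in for the frozen implementation module.
impl = types.ModuleType("_frozen_demo")
impl.answer = 42

# Hypothetical stand-in for the importlib package.
package = types.ModuleType("demo_pkg")
package.__path__ = []  # mark it as a package

sys.modules["_frozen_demo"] = impl
sys.modules["demo_pkg"] = package

# The alias: a second sys.modules key plus a package attribute,
# both pointing at the same module object.
sys.modules["demo_pkg._bootstrap"] = impl
package._bootstrap = impl

import demo_pkg._bootstrap

print(demo_pkg._bootstrap is impl)    # one object, two names
print(demo_pkg._bootstrap.__name__)   # __name__ is left untouched
```

As with os.path, the module's own `__name__` still reports where it really came from, which is the behaviour argued for above.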

----------



report at bugs

Apr 25, 2012, 5:38 AM

Post #39 of 73 (396 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Antoine Pitrou <pitrou [at] free> added the comment:

> Would be easier to tell distutils to install the extensions
> in a fixed name dir (instead of using a platform and version
> in the name) and then use that in getpath.c. distutils is pretty
> flexible at that :-)

Look, this is becoming very off-topic and you aren't proposing anything
concrete (I see neither patches nor problems being solved). Could you
open another issue, if you care so much?

----------



report at bugs

Apr 25, 2012, 5:46 AM

Post #40 of 73 (394 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Antoine Pitrou <pitrou [at] free> added the comment:

> Looking at Antoine's patch, I'd be happier with it if it *didn't*
> mutate the attributes of _frozen_importlib, but instead just added
> importlib._bootstrap as an alias for accessing it.

I thought it would be nicer for __file__, __name__ and __package__ to reflect the actual source code metadata (__file__ is always a py file while __cached__ may point to the compiled bytecode). But I don't have any strong feelings about that.

Yes, __file__ can end up misleading if you modify the Python source without recompiling, but I think most people would only read the code without modifying it.
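
To illustrate the source-versus-bytecode distinction, here is a small sketch using an ordinary pure-Python module (`json` is an arbitrary choice, and the example path is made up). It reads the same information from the module spec that `__file__` and `__cached__` are derived from.

```python
import importlib.util
import json  # any pure-Python stdlib module serves as an example

spec = json.__spec__
print(spec.origin)  # path to the source file, the value behind __file__
print(spec.cached)  # path to the compiled bytecode file, if one is defined

# The mapping from a source path to its bytecode path is a pure string
# computation; the file does not need to exist.
print(importlib.util.cache_from_source("/tmp/example/mod.py"))
```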

----------



report at bugs

Apr 25, 2012, 7:58 AM

Post #41 of 73 (395 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Brett Cannon <brett [at] python> added the comment:

To answer MAL's question about startup, I benchmarked on my machine using the normal_startup benchmark from hg.python.org/benchmarks and the bootstrap work only caused a 5-6% slowdown in a non-debug build. If you do it in a debug build it's much worse (I think it was 12% when I benchmarked).

----------



report at bugs

Apr 25, 2012, 8:14 AM

Post #42 of 73 (400 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Nick Coghlan <ncoghlan [at] gmail> added the comment:

OK, I'm leaning back towards my original preference of getting _frozen_importlib out of the way as quickly as we can.

Specifically, I'm thinking of separating out the entry point used by importlib.__init__ from that used by pythonrun.c, such that the latter calls a "_bootstrap_from_frozen" function that returns a reference to "importlib._bootstrap", which pythonrun then places in the interpreter state.

There would be a few builtin modules that still end up with loaders from _frozen_importlib (specifically, those referenced from importlib._bootstrap._setup as well as importlib itself), but the vast majority of imported modules would only see the "real" versions from importlib._bootstrap.

Attached patch is an initial attempt (the reference counting on the two modules is likely still a bit dodgy - this is my first version that didn't segfault as I got used to the mechanics of dealing with a frozen module, so it errs on the side of leaking references)

----------
Added file: http://bugs.python.org/file25364/issue14657_bootstrap_from_disk.diff



report at bugs

Apr 25, 2012, 8:25 AM

Post #43 of 73 (394 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Antoine Pitrou <pitrou [at] free> added the comment:

> Attached patch is an initial attempt (the reference counting on the
> two modules is likely still a bit dodgy - this is my first version
> that didn't segfault as I got used to the mechanics of dealing with a
> frozen module, so it errs on the side of leaking references)

But does it make debugging any easier? The IO streams are not yet
initialized at that point (neither are the codecs), so you are executing
_bootstrap.py from a very bare interpreter.

----------



report at bugs

Apr 25, 2012, 8:38 AM

Post #44 of 73 (397 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Nick Coghlan <ncoghlan [at] gmail> added the comment:

Yes, in that you'll be able to pick up changes in _bootstrap.py *without* having to rebuild Python.

With this in place, we could then get rid of the automatic regeneration of importlib.h which is a complete nightmare if you ever break your built interpreter while hacking on the bootstrapping (as I now know from experience).

With my approach, the experience is instead:

- modify _bootstrap.py, hack until any new tests pass
- run a new explicit "make freeze_importlib" command
- run "make"
- check everything still works
- commit and push

If you forget to run "make freeze_importlib", it doesn't really matter all that much, since the frozen one will only be used to find the real one, so it isn't a disaster if it's a little out of date. (That said, we should still have a test that at least checks the two modules have the same attributes)

It does mean that importlib.__init__ also needs to be able to run in a partially initialised interpreter, hence the switch from "import imp" to "import _imp".

----------



report at bugs

Apr 25, 2012, 8:41 AM

Post #45 of 73 (397 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Nick Coghlan <ncoghlan [at] gmail> added the comment:

Actually, rather than a test in the test suite, we would just change the current automatic rebuild to a Modules/Setup style warning: "'Lib/importlib/_bootstrap.py' is newer than 'Python/importlib.h', you may need to run 'make freeze_importlib'"
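
A minimal sketch of such a staleness check, assuming a plain modification-time comparison. The helper names `is_stale` and `warn_if_stale` are invented for illustration, and the real check would presumably live in the Makefile rather than in Python.

```python
import os
import sys


def is_stale(source, generated):
    """Return True if `generated` is missing or older than `source`."""
    if not os.path.exists(generated):
        return True
    return os.path.getmtime(source) > os.path.getmtime(generated)


def warn_if_stale(source="Lib/importlib/_bootstrap.py",
                  generated="Python/importlib.h"):
    """Print a warning instead of failing the build; return whether it fired."""
    if is_stale(source, generated):
        print(f"'{source}' is newer than '{generated}', "
              "you may need to run 'make freeze_importlib'",
              file=sys.stderr)
        return True
    return False
```

Run from the top of a checkout, `warn_if_stale()` only reports; it never regenerates anything, which matches the non-fatal behaviour being proposed.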

----------



report at bugs

Apr 25, 2012, 8:52 AM

Post #46 of 73 (395 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Éric Araujo <merwok [at] netwok> added the comment:

> How do we currently tell that the interpreter is running in a checkout?
sysconfig.is_python_build()

Someone has to confirm that this works on Windows too, as I’ve been told that not installed vs. installed is less clear on that OS.
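
For reference, a short usage sketch of that call; the wrapper name `describe_interpreter` is made up for illustration.

```python
import sysconfig


def describe_interpreter():
    """Report whether this interpreter runs from a build directory."""
    if sysconfig.is_python_build():
        return "checkout"
    return "installed"


if __name__ == "__main__":
    print(describe_interpreter())
```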

----------
nosy: +eric.araujo



report at bugs

Apr 25, 2012, 8:55 AM

Post #47 of 73 (397 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Antoine Pitrou <pitrou [at] free> added the comment:

> Actually, rather than a test in the test suite, we would just change the
> current automatic rebuild to a Modules/Setup style
> "'Lib/importlib/_bootstrap.py' is newer than 'Python/importlib.h', you
> may need to run 'make freeze_importlib'"

-1 from me. Nobody pays attention to this kind of warning.
(and the Modules/Setup thing is a nuisance)
Really, we must ensure that the frozen version of importlib is
up-to-date.
Also, normally you would write your tests in test_import, so that the
builtin import *is* tested. So you have to regenerate importlib before
committing (or you break the buildbots).

----------



report at bugs

Apr 25, 2012, 8:58 AM

Post #48 of 73 (394 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Nick Coghlan <ncoghlan [at] gmail> added the comment:

The other advantage of splitting the entry points is that we can tweak Brett's plan to make the import machinery explicit such that it happens in a separate function that's only called from __init__.py.

That way the published hooks will always be from the on-disk implementation and never from the frozen one.

If you're after the ability to emit debugging messages in a way that doesn't cause fatal errors during system startup, the only way I can see is to have a "do nothing" module level display function in _bootstrap.py that is later replaced with a reference to builtins.print:

def _debug(*args, **kwds):
    pass

----------



report at bugs

Apr 25, 2012, 9:02 AM

Post #49 of 73 (396 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Nick Coghlan <ncoghlan [at] gmail> added the comment:

At the very least, failing to regenerate importlib.h shouldn't be a fatal build error. It should just run with what it's got, and hopefully you will get a working interpreter out the other end, such that you can regenerate the frozen module on the next pass.

If we change that, then I'm OK with keeping the automatic rebuild.

----------



report at bugs

Apr 25, 2012, 9:40 AM

Post #50 of 73 (395 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Brett Cannon <brett [at] python> added the comment:

So how would you tweak the explicit work I'm doing? The code is going to rely on sys.path_hooks and sys.meta_path being populated. I guess the frozen code can set them up initially, and then importlib simply substitutes out classes from the frozen module for the ones from the source version (which should be easy based on __class__ and __class__.__name__ or something). Or if you do this before anyone else (e.g. zipimport) gets to sys.path_hooks and sys.meta_path then you could just blow them away without care and simply set them up again.
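
The substitution idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `swap_classes` is invented, matching is done on `__name__` alone, and it works on a copy of the list rather than touching sys.meta_path or sys.path_hooks.

```python
import types


def swap_classes(hooks, source_module):
    """Return a copy of `hooks` with entries replaced by same-named
    attributes of `source_module`; anything without a match is kept."""
    swapped = []
    for hook in hooks:
        name = getattr(hook, "__name__", None)
        replacement = getattr(source_module, name, None) if name else None
        swapped.append(replacement if replacement is not None else hook)
    return swapped


# Two distinct classes sharing a name, standing in for the frozen
# and on-disk versions of a finder.
frozen_finder = type("PathFinder", (), {})
source_finder = type("PathFinder", (), {})
source_module = types.SimpleNamespace(PathFinder=source_finder)

third_party = object()  # e.g. a hook registered by someone else
meta_path = [frozen_finder, third_party]
new_meta_path = swap_classes(meta_path, source_module)
print(new_meta_path[0] is source_finder)  # frozen class was substituted
print(new_meta_path[1] is third_party)    # unrelated entry left alone
```

Keeping unmatched entries in place is what distinguishes this from the "blow them away" alternative, which is only safe before anything else has registered a hook.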

----------

