
Mailing List Archive: Python: Bugs

[issue14657] Avoid two importlib copies

 

 



report at bugs

Apr 23, 2012, 3:34 PM

Post #1 of 73 (375 views)
Permalink
[issue14657] Avoid two importlib copies

New submission from Antoine Pitrou <pitrou [at] free>:

This patch avoids creating a second copy of importlib._bootstrap when a first one exists as _frozen_importlib.
This isn't perfect as it mutates the module when importlib is imported for the first time, but I think it's better than the status quo.
Also, importlib itself could be imported somewhere along the startup phase, so that all this is invisible to the user.

I'm not sure how to test this, since _frozen_importlib is an implementation detail, and changing that module's name would probably defeat the test already.

----------
components: Interpreter Core, Library (Lib)
files: unique_importlib.patch
keywords: patch
messages: 159096
nosy: brett.cannon, ncoghlan, pitrou
priority: normal
severity: normal
stage: patch review
status: open
title: Avoid two importlib copies
type: behavior
versions: Python 3.3
Added file: http://bugs.python.org/file25328/unique_importlib.patch

_______________________________________
Python tracker <report [at] bugs>
<http://bugs.python.org/issue14657>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com


report at bugs

Apr 23, 2012, 3:42 PM

Post #2 of 73 (357 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Changes by Antoine Pitrou <pitrou [at] free>:


Removed file: http://bugs.python.org/file25328/unique_importlib.patch



report at bugs

Apr 23, 2012, 3:42 PM

Post #3 of 73 (358 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Changes by Eric V. Smith <eric [at] trueblade>:


----------
nosy: +eric.smith



report at bugs

Apr 23, 2012, 3:43 PM

Post #4 of 73 (356 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Changes by Antoine Pitrou <pitrou [at] free>:


Added file: http://bugs.python.org/file25329/unique_importlib.patch



report at bugs

Apr 23, 2012, 3:50 PM

Post #5 of 73 (357 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Antoine Pitrou <pitrou [at] free> added the comment:

New patch with tests.

----------
Added file: http://bugs.python.org/file25330/unique_importlib2.patch



report at bugs

Apr 23, 2012, 4:01 PM

Post #6 of 73 (357 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Antoine Pitrou <pitrou [at] free> added the comment:

New patch also avoids calling _setup() a second time (which can be annoying since _setup() has a list.append() call somewhere).
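
To illustrate why a second _setup() call matters: a setup routine that appends to a shared list leaves duplicate entries when it runs twice. The following is only a minimal sketch with hypothetical names (path_hooks, hook, setup, setup_once); it does not reproduce importlib's actual _setup() or the patch.

```python
# Hypothetical stand-ins; not importlib's real _setup() or sys.path_hooks.
path_hooks = []


def hook(path):
    """Placeholder hook; a real one would return a finder for `path`."""
    return None


def setup():
    """Naive setup: every call appends the hook again."""
    path_hooks.append(hook)


_setup_done = False


def setup_once():
    """Guarded setup: only the first call has any effect."""
    global _setup_done
    if _setup_done:
        return
    _setup_done = True
    path_hooks.append(hook)


setup()
setup()
naive_count = len(path_hooks)    # the hook is now registered twice

path_hooks.clear()
setup_once()
setup_once()
guarded_count = len(path_hooks)  # the hook is registered once
```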

----------
Added file: http://bugs.python.org/file25331/unique_importlib3.patch



report at bugs

Apr 23, 2012, 5:09 PM

Post #7 of 73 (349 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Changes by Eric Snow <ericsnowcurrently [at] gmail>:


----------
nosy: +eric.snow



report at bugs

Apr 23, 2012, 5:41 PM

Post #8 of 73 (351 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Brett Cannon <brett [at] python> added the comment:

So why the mutation? Are you that worried someone is going to import importlib._bootstrap directly?

This also costs in development complexity because not only do you have to run 'make' to get changes to be testable, but it also leads to difficult debugging situations where if you are not totally sure you got something working you won't find out until you see e.g. that the standard I/O streams were not initialized.

If you really feel the need to hide _frozen_importlib then it would be better to do the minimum required to get import up and running (should be once the encodings are up in Py_Initialize) and then pull in importlib._bootstrap and have that clear out what _frozen_importlib set like __import__(), sys.path_importer_cache, and eventually sys.meta_path and sys.path_hooks (I wouldn't touch sys.modules, though, thanks to built-ins and extensions not liking to be reloaded).

----------



report at bugs

Apr 23, 2012, 5:56 PM

Post #9 of 73 (350 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Brett Cannon <brett [at] python> added the comment:

I should also mention that all of this becomes much less important once issue #14605 is finished because at that point sys.meta_path and sys.path_hooks will have _frozen_importlib objects and that will be what importlib works off of directly. But I still understand the desire to eliminate _frozen_importlib from being exposed, it's just a matter of coming up with a reasonable solution.

----------



report at bugs

Apr 23, 2012, 9:40 PM

Post #10 of 73 (346 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Nick Coghlan <ncoghlan [at] gmail> added the comment:

My preference would also be for _frozen_importlib._bootstrap to overwrite as much evidence of itself as it can with the "real" one.

This would also mean that changes to importlib._bootstrap would actually take effect for user code almost immediately, *without* rebuilding Python, as the frozen version would *only* be used to get hold of the pure Python version.

----------



report at bugs

Apr 23, 2012, 9:48 PM

Post #11 of 73 (345 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Changes by Arfrever Frehtes Taifersar Arahesis <Arfrever.FTA [at] GMail>:


----------
nosy: +Arfrever



report at bugs

Apr 24, 2012, 1:10 AM

Post #12 of 73 (346 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Antoine Pitrou <pitrou [at] free> added the comment:

> So why the mutation? Are you that worried someone is going to import
> importlib._bootstrap directly?

Well, importing importlib *does* import importlib._bootstrap, and
creates another copy of the module. importlib.__import__ is then wired
to _bootstrap.__import__, which is different from the built-in
__import__ (potentially using different globals, for example).

> This also costs in development complexity because not only do you have
> to run 'make' to get changes to be testable, but it also leads to
> difficult debugging situations where if you are not totally sure you
> got something working you won't find out until you see e.g. that the
> standard I/O streams were not initialized.

I'm worried that two different copies of importlib will lead to its own
difficult debugging situations.
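
The kind of problem meant here can be shown with a small sketch: executing the same source twice yields two module objects whose globals and functions are unrelated. The names below (SOURCE, load_copy, the two module names) are hypothetical and only mimic the frozen/on-disk split; this is not how the interpreter actually loads importlib.

```python
import types

# One piece of source code, standing in for importlib/_bootstrap.py.
SOURCE = """
_cache = []

def register(item):
    _cache.append(item)
    return len(_cache)
"""


def load_copy(name):
    """Execute SOURCE in a fresh module object, as a separate import would."""
    module = types.ModuleType(name)
    exec(compile(SOURCE, name, "exec"), module.__dict__)
    return module


frozen_copy = load_copy("_frozen_demo")     # plays the role of _frozen_importlib
source_copy = load_copy("demo._bootstrap")  # plays the role of importlib._bootstrap

frozen_copy.register("a")

print(frozen_copy._cache)                            # ['a']
print(source_copy._cache)                            # []
print(frozen_copy.register is source_copy.register)  # False
```

State recorded through one copy is invisible to the other, which is the mix-up the patch tries to rule out.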

----------



report at bugs

Apr 24, 2012, 2:06 AM

Post #13 of 73 (345 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Antoine Pitrou <pitrou [at] free> added the comment:

> This would also mean that changes to importlib._bootstrap would
> actually take effect for user code almost immediately, *without*
> rebuilding Python, as the frozen version would *only* be used to get
> hold of the pure Python version.

Actually, _io, encodings and friends must be loaded before importlib
gets imported from Python code, so you will still have __loader__
entries referencing the frozen importlib, unless you also rewrite these
attributes.

My desire here is not to hide _frozen_importlib, rather to avoid subtle
issues with two instances of a module living in memory with separate
global states. Whether it's the frozen version or the on-disk Python
version that gets the preference is another question (a less important
one in my mind).
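
One way to see which import machinery loaded the modules already in memory is to group sys.modules by the module that defines each loader. This is a rough sketch for a current CPython; it falls back on __spec__, which did not exist yet when this thread was written, and loader_origin is a hypothetical helper name.

```python
import sys
from collections import Counter


def loader_origin(module):
    """Return the name of the module that defines `module`'s loader."""
    spec = getattr(module, "__spec__", None)
    loader = getattr(spec, "loader", None) or getattr(module, "__loader__", None)
    if loader is None:
        return "<no loader>"
    # Some loaders are used as classes rather than instances.
    cls = loader if isinstance(loader, type) else type(loader)
    return cls.__module__


# Count loaded modules per loader origin, e.g. the frozen bootstrap
# versus an on-disk importlib copy.
counts = Counter(loader_origin(m) for m in list(sys.modules.values()))
for origin, count in counts.most_common():
    print(origin, count)
```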

----------



report at bugs

Apr 24, 2012, 2:14 AM

Post #14 of 73 (349 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Marc-Andre Lemburg <mal [at] egenix> added the comment:

Antoine Pitrou wrote:
>
> Antoine Pitrou <pitrou [at] free> added the comment:
>
>> This would also mean that changes to importlib._bootstrap would
>> actually take effect for user code almost immediately, *without*
>> rebuilding Python, as the frozen version would *only* be used to get
>> hold of the pure Python version.
>
> Actually, _io, encodings and friends must be loaded before importlib
> gets imported from Python code, so you will still have __loader__
> entries referencing the frozen importlib, unless you also rewrite these
> attributes.
>
> My desire here is not to hide _frozen_importlib, rather to avoid subtle
> issues with two instances of a module living in memory with separate
> global states. Whether it's the frozen version or the on-disk Python
> version that gets the preference is another question (a less important
> one in my mind).

Why don't you freeze the whole importlib package to avoid all these
issues ? As side effect, it will also load a little faster.

----------
nosy: +lemburg



report at bugs

Apr 24, 2012, 8:10 AM

Post #15 of 73 (344 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Brett Cannon <brett [at] python> added the comment:

To start, I'm *not* going to make the final call on this issue's solution. I'm inches away from importlib burnout and general integration frustration with trying to clean up the implicit behaviour. So to prevent me from making a bad decision I will let you guys make the final call.

Anyway, I see two options here. One is the "let _frozen_importlib be *the* implementation, period" argument put forth by Antoine (MAL's "freeze everything" also falls under this since it suffers from the same issues). This is the easiest solution for the issue of not having overlapping implementations causing potential mix-ups, etc. The issue is that development difficulty goes up, as now you are adding a compile step where if you screw up you can get really bad error messages (e.g. "standard streams could not be created" kind of stuff). This could theoretically be overcome if the importlib tests all used a manually created module directly from the source code to verify things before rebuilding (as well as making sure sys.path_importer_cache was cleaned out). That would take a restructuring of importlib's tests to use a common TestCase with the proper setUp()/tearDown() for keeping things clean, along with class and module fixtures to prevent obscene stuff like re-importing for every test method. Another option is we hide the source as _importlib or something to allow direct importation w/o any tricks under a protected name.

Then there is Nick's proposal of using _frozen_importlib to start up and then swap out with a new version created from the source during startup. This keeps development simple since the tests run against the code *almost* all other code will use and thus eliminate the test. The problem here is that startup is a smidgen slower and it requires you blacklist what needs to get swapped out and if you mess up that will be tough to debug as well.

Both get the same outcome but with different approaches, it's just a question of which one is easiest to maintain.

----------



report at bugs

Apr 24, 2012, 8:16 AM

Post #16 of 73 (343 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Antoine Pitrou <pitrou [at] free> added the comment:

On Tuesday, April 24, 2012 at 15:10 +0000, Brett Cannon wrote:
> Both get the same outcome but with different approaches, it's just a
> question of which one is easiest to maintain.

I don't have any strong preference. Nick's proposal sounds slightly
better but Nick hasn't uploaded a patch yet :-)

----------



report at bugs

Apr 24, 2012, 10:37 AM

Post #17 of 73 (356 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Marc-Andre Lemburg <mal [at] egenix> added the comment:

> test method. Another option is we hide the source as _importlib or something to allow direct importation w/o any tricks under a protected name.

Using the freeze everything approach you make things easier for the
implementation, since you don't have to think about whether certain
pieces of code are already available or not.

For development, you can also have the package load bytecode
or source from an external package instead of running (all of)
the module's bytecode that was compiled into the binary.

This is fairly easy to do, since the needed exec() does not
depend on the import machinery.

The only downside is the big if statement needed to isolate the frozen
version from the loaded one - would be great if we had a
command to stop module execution or code execution for a block to
make that more elegant, e.g. "break" at module scope :-)

----------



report at bugs

Apr 24, 2012, 11:40 AM

Post #18 of 73 (342 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Eric Snow <ericsnowcurrently [at] gmail> added the comment:

> would be great if we had a
> command to stop module execution or code execution for a block to
> make that more elegant, e.g. "break" at module scope :-)

I floated that proposal on python-list a while back and the reaction was mixed. [1] Maybe it's time to try again. (moving over to python-ideas...)

[1] http://mail.python.org/pipermail/python-list/2011-June/1274424.html

----------



report at bugs

Apr 24, 2012, 12:23 PM

Post #19 of 73 (344 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Brett Cannon <brett [at] python> added the comment:

I don't quite follow what you are suggesting, MAL. Are you saying to freeze importlib.__init__ and importlib._bootstrap and somehow have importlib.__init__ choose what to load, frozen or source?

----------



report at bugs

Apr 24, 2012, 12:51 PM

Post #20 of 73 (343 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Marc-Andre Lemburg <mal [at] egenix> added the comment:

Brett Cannon wrote:
>
> Brett Cannon <brett [at] python> added the comment:
>
> I don't quite follow what you are suggesting, MAL. Are you saying to freeze importlib.__init__ and importlib._bootstrap and somehow have importlib.__init__ choose what to load, frozen or source?

No, it always loads and runs the frozen code, but at the start of
the module code it branches between the frozen bytecode and the code
read from an external file.

Pseudo-code in every module you wish to be able to host externally:

#
# MyModule
#
if operating_in_dev_mode and '<frozen>' in __file__:
    exec(open('dev-area/MyModule.py', 'r').read(), globals(), globals())
else:
    # Normal module code
    class MyClass: ...
    # hundreds of lines of code...

Aside: With a module scope "break", the code would look more elegant:

#
# MyModule
#
if operating_in_dev_mode and '<frozen>' in __file__:
    exec(open('dev-area/MyModule.py', 'r').read(), globals(), globals())
    break

# Normal module code
class MyClass: ...
# hundreds of lines of code...
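
The first variant of the pseudo-code above can be turned into a runnable sketch. Everything here is an assumption made for illustration: populate() stands in for the module body, the dev_mode argument replaces operating_in_dev_mode, and a temporary file replaces dev-area/MyModule.py.

```python
import os
import tempfile


def populate(namespace, dev_mode, dev_path):
    """Fill `namespace` from an external file in dev mode, else from the built-in body."""
    if dev_mode and os.path.isfile(dev_path):
        with open(dev_path, "r", encoding="utf-8") as f:
            source = f.read()
        # exec() does not depend on the import machinery.
        exec(compile(source, dev_path, "exec"), namespace)
        return namespace
    # "Normal module code": what would be compiled into the binary.
    namespace["ORIGIN"] = "frozen"
    return namespace


with tempfile.TemporaryDirectory() as tmp:
    dev_file = os.path.join(tmp, "MyModule.py")
    with open(dev_file, "w", encoding="utf-8") as f:
        f.write("ORIGIN = 'external'\n")

    dev_ns = populate({}, dev_mode=True, dev_path=dev_file)
    frozen_ns = populate({}, dev_mode=False, dev_path=dev_file)

print(dev_ns["ORIGIN"])     # external
print(frozen_ns["ORIGIN"])  # frozen
```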

----------



report at bugs

Apr 24, 2012, 1:25 PM

Post #21 of 73 (344 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Brett Cannon <brett [at] python> added the comment:

So basically if you are running in a checkout, grab the source file and compile it manually since its location is essentially hard-coded and thus you don't need to care about sys.path and all the other stuff required to do an import, while using the frozen code for when you are running an installed module since you would otherwise need to do the search for importlib's source file to do a load at startup properly.

That's an interesting idea. How do we currently tell that the interpreter is running in a checkout? Is that exposed in any way to Python code?

----------



report at bugs

Apr 24, 2012, 1:39 PM

Post #22 of 73 (341 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Antoine Pitrou <pitrou [at] free> added the comment:

> That's an interesting idea. How do we currently tell that the
> interpreter is running in a checkout? Is that exposed in any way to
> Python code?

Look for _BUILDDIR_COOKIE in setup.py. But that's only for non-Windows
platforms (I don't think setup.py is invoked under Windows).

----------



report at bugs

Apr 24, 2012, 1:42 PM

Post #23 of 73 (344 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Marc-Andre Lemburg <mal [at] egenix> added the comment:

Brett Cannon wrote:
>
> Brett Cannon <brett [at] python> added the comment:
>
> So basically if you are running in a checkout, grab the source file and compile it manually since its location is essentially hard-coded and thus you don't need to care about sys.path and all the other stuff required to do an import, while using the frozen code for when you are running an installed module since you would otherwise need to do the search for importlib's source file to do a load at startup properly.

Right.

> That's an interesting idea. How do we currently tell that the interpreter is running in a checkout? Is that exposed in any way to Python code?

There's some magic happening in site.py for checkouts, but I'm not sure
whether any of that is persistent or even available at the time these
particular imports would happen.

Then again, I'm not sure you need to know whether you have a checkout
or not. You just need some flag to identify whether you want the
search for external module code to take place or not. sys.flags
could be used for that.

----------



report at bugs

Apr 24, 2012, 2:13 PM

Post #24 of 73 (343 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Brett Cannon <brett [at] python> added the comment:

Modules/getpath.c seems to be where the C code does it when getting paths for sys.path. So it would be possible to use that same algorithm to set some sys attribute (e.g. in_checkout or something) much like sys.gettotalrefcount is optional and only shown when built with --with-pydebug. Otherwise some directory structure check could be done (e.g. find importlib/_bootstrap.py off of sys.path, and then see if ../Modules/Setup or something also exists that would never show up in an installed CPython).
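
The directory structure check could look roughly like the sketch below. The helper names (find_bootstrap, looks_like_checkout, make_tree) are hypothetical, and the demonstration runs against throwaway directory trees rather than a real CPython checkout.

```python
import os
import tempfile


def find_bootstrap(paths):
    """Return the first importlib/_bootstrap.py found under `paths`, or None."""
    for entry in paths:
        base = os.path.abspath(entry or os.curdir)
        candidate = os.path.join(base, "importlib", "_bootstrap.py")
        if os.path.isfile(candidate):
            return candidate
    return None


def looks_like_checkout(paths, marker=os.path.join("Modules", "Setup")):
    """Guess whether the library directory sits inside a source checkout."""
    bootstrap = find_bootstrap(paths)
    if bootstrap is None:
        return False
    lib_dir = os.path.dirname(os.path.dirname(bootstrap))  # .../Lib
    root = os.path.dirname(lib_dir)                         # checkout root
    return os.path.exists(os.path.join(root, marker))


def make_tree(root, with_marker):
    """Build a fake Lib/importlib tree, optionally with Modules/Setup next to Lib."""
    package = os.path.join(root, "Lib", "importlib")
    os.makedirs(package)
    open(os.path.join(package, "_bootstrap.py"), "w").close()
    if with_marker:
        os.makedirs(os.path.join(root, "Modules"))
        open(os.path.join(root, "Modules", "Setup"), "w").close()
    return os.path.join(root, "Lib")


with tempfile.TemporaryDirectory() as checkout, tempfile.TemporaryDirectory() as installed:
    in_checkout = looks_like_checkout([make_tree(checkout, with_marker=True)])
    in_install = looks_like_checkout([make_tree(installed, with_marker=False)])

print(in_checkout)  # True
print(in_install)   # False
```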

----------



report at bugs

Apr 24, 2012, 2:15 PM

Post #25 of 73 (357 views)
Permalink
[issue14657] Avoid two importlib copies [In reply to]

Antoine Pitrou <pitrou [at] free> added the comment:

> Otherwise some directory structure check could be done (e.g. find
> importlib/_bootstrap.py off of sys.path, and then see
> if ../Modules/Setup or something also exists that would never show up
> in an installed CPython).

Well, the directory structure check *is* pybuilddir.txt (under POSIX,
again; under Windows, you might want to check for Lib/importlib
directly).
But, agreed, this could be factored in a sys._private_something
attribute.
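
For comparison, the standard library's sysconfig module already exposes a check of this kind to Python code. It is shown here only as a sketch: sysconfig has to be imported normally, so it would presumably not be usable from the frozen bootstrap itself, and it is not what the patches in this issue do.

```python
import sysconfig

# True when the interpreter runs from the directory it was built in,
# False for an installed Python.
running_from_build_dir = sysconfig.is_python_build()
print(running_from_build_dir)
```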

----------

