
Mailing List Archive: Python: Dev

Using itertools in modules that are part of the build chain (Re: [Python-checkins] r76264 - python/branches/py3k/Lib/tokenize.py)

 

 



ncoghlan at gmail

Nov 14, 2009, 7:06 PM

Post #1 of 4

benjamin.peterson wrote:
> Modified: python/branches/py3k/Lib/tokenize.py
> ==============================================================================
> --- python/branches/py3k/Lib/tokenize.py (original)
> +++ python/branches/py3k/Lib/tokenize.py Sat Nov 14 17:27:26 2009
> @@ -377,17 +377,12 @@
> The first token sequence will always be an ENCODING token
> which tells you which encoding was used to decode the bytes stream.
> """
> + # This import is here to avoid problems when the itertools module is not
> + # built yet and tokenize is imported.
> + from itertools import chain

This is probably a bad idea - calling tokenize.tokenize() from a thread
started as a side effect of importing a module will now deadlock on the
import lock if the module import waits for that thread to finish.

We tell people not to do that (starting and then waiting on threads as
part of module import) for exactly this reason, but it is also the
reason we avoid embedding import statements inside functions in the
standard library (not to mention encouraging third party developers to
also avoid embedding import statements inside functions).
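
One way to keep the import at module level even when the extension module may be missing is a pure-Python fallback. This is only a sketch of that general pattern, not what was committed in r76264 or proposed in this thread; the helper name first_lines_then_rest is hypothetical.

```python
# Module-level import with a pure-Python fallback (illustrative only).
try:
    from itertools import chain
except ImportError:  # e.g. the extension module has not been built yet
    def chain(*iterables):
        """Minimal stand-in for itertools.chain."""
        for iterable in iterables:
            for item in iterable:
                yield item


def first_lines_then_rest(consumed, remaining):
    """Hypothetical helper: replay already-read lines before the rest."""
    return chain(consumed, remaining)


print(list(first_lines_then_rest([b"a"], [b"b", b"c"])))
```

Because the import runs when the module itself is imported, no function call made later from another thread has to take the import lock.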

This does constrain where we can use itertools - if we want carte
blanche to use it anywhere in the standard library, even those parts
that are imported as part of the build chain, we'll need to bite the
bullet and make it a builtin module rather than a separately built
extension module.
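
Whether a given module is compiled into the interpreter or built separately can be checked at runtime. A minimal sketch, assuming only the standard sys.builtin_module_names tuple; the helper name is_builtin_module is made up for illustration, and the answer for itertools depends on how the interpreter was built.

```python
import sys


def is_builtin_module(name):
    """Return True if `name` is compiled into the interpreter itself."""
    return name in sys.builtin_module_names


for name in ("sys", "itertools"):
    status = "builtin" if is_builtin_module(name) else "not builtin"
    print(name, status)
```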

Cheers,
Nick.

P.S. The problem is easy to demonstrate on the current Py3k branch:

1. Put this in a module file in your py3k directory (e.g. "deadlock.py"):
-----------
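import threading
import tokenize
f = open(__file__, 'rU')
def _deadlocks():
    # tokenize() runs "from itertools import chain", which needs the import lock
    tokenize.tokenize(f.readline)
t = threading.Thread(target=_deadlocks)
t.start()
# The importing thread still holds the import lock while it waits here
t.join()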
-----------

2. Then run: ./python -c "import deadlock"

It will, as advertised, deadlock, and you'll need to use Ctrl-Break or
kill -9 to get rid of it. (Note that preventing this kind of thing is one
of the major reasons why direct execution and even the -m switch *don't*
hang onto the import lock while running the __main__ module.)
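
The mechanism can be illustrated without hanging the interpreter. The sketch below is an assumption-laden model, not the real import machinery: a plain threading.RLock stands in for the import lock, and the worker uses a timeout so the script terminates instead of deadlocking. Later CPython releases reworked import locking, so the original script may behave differently there.

```python
import threading

# Stand-in for the interpreter-wide import lock (an assumption for this demo).
import_lock = threading.RLock()


def worker(result):
    # Models a function-level import: it must take the import lock first.
    acquired = import_lock.acquire(timeout=0.5)
    result.append(acquired)
    if acquired:
        import_lock.release()


def simulate_module_import():
    """Hold the lock as an importer would, start a thread, and wait on it."""
    result = []
    with import_lock:  # the importing thread owns the lock for the module body
        t = threading.Thread(target=worker, args=(result,))
        t.start()
        t.join()  # returns only because the worker gives up after the timeout
    return result[0]


print("worker got the lock:", simulate_module_import())
```

Without the timeout the worker would block forever on the lock while the importing thread blocks forever in join(), which is the situation described above.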

--
Nick Coghlan | ncoghlan [at] gmail | Brisbane, Australia
---------------------------------------------------------------
_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


benjamin at python

Nov 14, 2009, 8:01 PM

Post #2 of 4
Re: [Python-checkins] Using itertools in modules that are part of the build chain (Re: r76264 - python/branches/py3k/Lib/tokenize.py)

2009/11/14 Nick Coghlan <ncoghlan [at] gmail>:
> This does constrain where we can use itertools - if we want carte
> blanche to use it anywhere in the standard library, even those parts
> that are imported as part of the build chain, we'll need to bite the
> bullet and make it a builtin module rather than a separately built
> extension module.

I have another unpleasant but slightly less hacky solution. We put
detect_encoding in linecache where it is actually used.


--
Regards,
Benjamin


brett at python

Nov 15, 2009, 12:40 PM

Post #3 of 4
Re: [Python-checkins] Using itertools in modules that are part of the build chain (Re: r76264 - python/branches/py3k/Lib/tokenize.py)

On Sat, Nov 14, 2009 at 20:01, Benjamin Peterson <benjamin [at] python> wrote:
> 2009/11/14 Nick Coghlan <ncoghlan [at] gmail>:
>> This does constrain where we can use itertools - if we want carte
>> blanche to use it anywhere in the standard library, even those parts
>> that are imported as part of the build chain, we'll need to bite the
>> bullet and make it a builtin module rather than a separately built
>> extension module.
>
> I have another unpleasant but slightly less hacky solution. We put
> detect_encoding in linecache where it is actually used.

Well, it happens to be used by the standard library in linecache, but
not all external uses of it necessarily tie into linecache (e.g.
importlib uses detect_encoding() in some non-critical code). Might
just have to live with sub-optimal code.

-Brett


benjamin at python

Nov 15, 2009, 12:43 PM

Post #4 of 4
Re: [Python-checkins] Using itertools in modules that are part of the build chain (Re: r76264 - python/branches/py3k/Lib/tokenize.py)

2009/11/15 Brett Cannon <brett [at] python>:
> On Sat, Nov 14, 2009 at 20:01, Benjamin Peterson <benjamin [at] python> wrote:
>> 2009/11/14 Nick Coghlan <ncoghlan [at] gmail>:
>>> This does constrain where we can use itertools - if we want carte
>>> blanche to use it anywhere in the standard library, even those parts
>>> that are imported as part of the build chain, we'll need to bite the
>>> bullet and make it a builtin module rather than a separately built
>>> extension module.
>>
>> I have another unpleasant but slightly less hacky solution. We put
>> detect_encoding in linecache where it is actually used.
>
> Well, it happens to be used by the standard library in linecache, but
> not all external uses of it necessarily tie into linecache (e.g.
> importlib uses detect_encoding() in some non-critical code). Might
> just have to live with sub-optimal code.

Well, what I mean is that we'd do:

def _detect_encoding():

in linecache and then "from linecache import _detect_encoding as
detect_encoding" in tokenize.py.
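
The re-export idea can be sketched in a single file. Everything here is a stand-in: the module name lowlevel_demo plays the role of linecache, the function body is a placeholder rather than the real detect_encoding logic, and types.ModuleType is used only so the example runs on its own.

```python
import sys
import types

# Stand-in for the low-level module (the thread proposes linecache).
lowlevel = types.ModuleType("lowlevel_demo")


def _detect_encoding(readline):
    """Placeholder body: always reports UTF-8 and consumes nothing."""
    return "utf-8", []


lowlevel._detect_encoding = _detect_encoding
sys.modules["lowlevel_demo"] = lowlevel
del _detect_encoding

# What the higher-level module (tokenize.py in the thread) would do:
from lowlevel_demo import _detect_encoding as detect_encoding

# The public name is the very same function object as the private one.
print(detect_encoding is lowlevel._detect_encoding)
```

With this layout the low-level module no longer has to import the higher-level one to reach the function, while existing callers keep using the public name.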



--
Regards,
Benjamin
