List comprehension/genexp inconsistency.

 

 



jcd at sdf

Mar 20, 2012, 1:23 PM

Post #1 of 4
List comprehension/genexp inconsistency.

One of my coworkers just stumbled across an interesting issue. I'm
hoping someone here can explain why it's happening.

When a class definition contains a generator expression with two loops,
there is a strange scoping issue: the variable used by the inner loop is
not found (although the one used by the outer loop is), while a list
comprehension has no problem finding both variables.

Demonstration:

>>> class Spam:
...     foo, bar = 4, 4
...     baz = dict(((x, y), x+y) for x in range(foo) for y in range(bar))
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in Spam
  File "<stdin>", line 3, in <genexpr>
NameError: global name 'bar' is not defined
>>> class Eggs(object):
...     foo, bar = 4, 4
...     baz = dict([((x, y), x+y) for x in range(foo) for y in range(bar)])
...
>>>

This was discovered in Python 2.6. In Python 3.2, both versions fail
with the same NameError.

Obviously, this is easy enough to work around. I'm curious though:
What's going on under the hood to cause the nested generator expression
to fail while the list comprehension succeeds?

Cheers,
Cliff


--
http://mail.python.org/mailman/listinfo/python-list


ian.g.kelly at gmail

Mar 20, 2012, 3:50 PM

Post #2 of 4
Re: List comprehension/genexp inconsistency. [In reply to]

On Tue, Mar 20, 2012 at 3:16 PM, Dennis Lee Bieber
<wlfraed [at] ix> wrote:
> On Tue, 20 Mar 2012 16:23:22 -0400, "J. Cliff Dyer"
> <jcd [at] sdf> declaimed the following in
> gmane.comp.python.general:
>
>>
>> When trying to create a class with a dual-loop generator expression in a
>> class definition, there is a strange scoping issue where the inner
>> variable is not found, (but the outer loop variable is found), while a
>> list comprehension has no problem finding both variables.
>>
>        Read http://www.python.org/dev/peps/pep-0289/ -- in particular, look
> for the word "leak"

No, this has nothing to do with the loop variable leaking. It appears
to have to do with the fact that the variables and the generator
expression are inside a class block. I think that it's related to the
reason that this doesn't work:

class Foo(object):
    x = 42
    def foo():
        print(x)
    foo()

In this case, x is not a local variable of foo, nor is it a global.
In order for foo to access x, it would have to be a closure -- but
Python can't make it a closure in this case, because the variable it
accesses is (or rather, will become) a class attribute, not a local
variable of a function that can be stored in a cell. Instead, the
compiler just makes it a global reference in the hope that such a
global will actually be defined when the code is run.
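
A minimal sketch of the function case above, written so that it runs as a standalone script. It assumes Python 3 and that no module-level name x exists; the try/except wrapper and the `outcome` variable are additions for illustration only. The point is that the lookup of x inside foo skips the class namespace entirely.

```python
# Assumes Python 3 and that no module-level name `x` exists.
try:
    class Foo(object):
        x = 42

        def foo():
            # x is compiled as a global lookup; the class namespace is skipped.
            return x

        foo()  # runs while the class body is still executing
    outcome = "no error"
except NameError as exc:
    outcome = "NameError: %s" % exc

print(outcome)
```

If a global named x did exist, foo would silently pick that one up instead of the class-level value.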

For that reason, what surprises me about Cliff's example is that a
generator expression works at all in that context. It seems to work
as long as it contains only one loop, but not if it contains two. To
find out why, I tried disassembling one:

>>> import dis
>>> class Foo(object):
...     x = 42
...     y = 12
...     g = (a+b for a in range(x) for b in range(y))
...
>>> dis.dis(Foo.g.gi_code)
  4           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                34 (to 40)
              6 STORE_FAST               1 (a)
              9 LOAD_GLOBAL              0 (range)
             12 LOAD_GLOBAL              1 (y)
             15 CALL_FUNCTION            1
             18 GET_ITER
        >>   19 FOR_ITER                15 (to 37)
             22 STORE_FAST               2 (b)
             25 LOAD_FAST                1 (a)
             28 LOAD_FAST                2 (b)
             31 BINARY_ADD
             32 YIELD_VALUE
             33 POP_TOP
             34 JUMP_ABSOLUTE           19
        >>   37 JUMP_ABSOLUTE            3
        >>   40 LOAD_CONST               0 (None)
             43 RETURN_VALUE

So that explains it. Notice that "x" is never actually accessed in
that disassembly; only "y" is. It turns out that the first iterator
[range(x)] is actually created before the generator ever starts
executing, and is stored as an anonymous local variable on the
generator's stack frame -- so it's created in the class scope, not in
the generator scope. The second iterator, however, is recreated on
every iteration of the first iterator, so it can't be pre-built in
that manner. It does get created in the generator scope, and when
that happens it blows up because it can't find the variable, just like
the function example above.
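
The one-loop versus two-loop difference described above can be checked directly. This is a sketch assuming Python 3 and no module-level name y; the class names OneLoop and TwoLoops and the `two_loop_outcome` variable are made up for the illustration.

```python
# Assumes Python 3 and that no module-level name `y` exists.
class OneLoop(object):
    x = 3
    # range(x) is the outermost iterable, so it is evaluated here,
    # in the class body, and handed to the generator as its argument.
    g = list(a for a in range(x))

try:
    class TwoLoops(object):
        x = 3
        y = 2
        # range(y) is evaluated inside the generator's own scope,
        # where the class-level name y is not visible.
        g = list(a + b for a in range(x) for b in range(y))
    two_loop_outcome = "no error"
except NameError as exc:
    two_loop_outcome = "NameError: %s" % exc

print(OneLoop.g)
print(two_loop_outcome)
```

The list() call forces the generator to run inside the class body, which is when the failing lookup of y actually happens.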

Cheers,
Ian


showell30 at yahoo

Mar 20, 2012, 8:07 PM

Post #3 of 4
Re: List comprehension/genexp inconsistency. [In reply to]

On Mar 20, 3:50 pm, Ian Kelly <ian.g.ke...@gmail.com> wrote:
> On Tue, Mar 20, 2012 at 3:16 PM, Dennis Lee Bieber
>
> <wlfr...@ix.netcom.com> wrote:
> > On Tue, 20 Mar 2012 16:23:22 -0400, "J. Cliff Dyer"
> > <j...@sdf.lonestar.org> declaimed the following in
> > gmane.comp.python.general:
>
> >> When trying to create a class with a dual-loop generator expression in a
> >> class definition, there is a strange scoping issue where the inner
> >> variable is not found, (but the outer loop variable is found), while a
> >> list comprehension has no problem finding both variables.
>
> >        Read http://www.python.org/dev/peps/pep-0289/ -- in particular, look
> > for the word "leak"
>
> No, this has nothing to do with the loop variable leaking.  It appears
> to have to do with the fact that the variables and the generator
> expression are inside a class block.

Interesting.

Just for completeness, the code does seem to work fine when you take
it out of the class:

Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> foo, bar = 4, 4
>>> g = (((x, y), x+y) for x in range(foo) for y in range(bar))
>>> dict(g)
{(0, 1): 1, (1, 2): 3, (3, 2): 5, (0, 0): 0, (3, 3): 6, (3, 0): 3,
(3, 1): 4, (2, 1): 3, (0, 2): 2, (2, 0): 2, (1, 3): 4, (2, 3): 5, (2,
2): 4, (1, 0): 1, (0, 3): 3, (1, 1): 2}
>>> import dis
>>> dis.dis(g.gi_code)
  1           0 SETUP_LOOP              57 (to 60)
              3 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                50 (to 59)
              9 STORE_FAST               1 (x)
             12 SETUP_LOOP              41 (to 56)
             15 LOAD_GLOBAL              0 (range)
             18 LOAD_GLOBAL              1 (bar)
             21 CALL_FUNCTION            1
             24 GET_ITER
        >>   25 FOR_ITER                27 (to 55)
             28 STORE_FAST               2 (y)
             31 LOAD_FAST                1 (x)
             34 LOAD_FAST                2 (y)
             37 BUILD_TUPLE              2
             40 LOAD_FAST                1 (x)
             43 LOAD_FAST                2 (y)
             46 BINARY_ADD
             47 BUILD_TUPLE              2
             50 YIELD_VALUE
             51 POP_TOP
             52 JUMP_ABSOLUTE           25
        >>   55 POP_BLOCK
        >>   56 JUMP_ABSOLUTE            6
        >>   59 POP_BLOCK
        >>   60 LOAD_CONST               0 (None)
             63 RETURN_VALUE


jcd at sdf

Mar 21, 2012, 6:50 AM

Post #4 of 4
Re: List comprehension/genexp inconsistency. [In reply to]

Thanks, Ian.

That does seem to explain it. The inner loop doesn't have access to the
class's namespace, and of course you can't fix it by referencing Foo.y
explicitly, because the class isn't fully defined yet.

Ultimately, we realized that the dict should be created in the __init__
method, so that it picks up the appropriate values of the foo and bar
attributes if the class is subclassed. That sidesteps the problem, but
it's a fascinating peek into Python internals.
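
A rough sketch of the __init__ approach, assuming Python 3. The Spam class reuses the names from the first post; the SmallSpam subclass is hypothetical and only there to show that overridden foo and bar values are picked up.

```python
class Spam(object):
    foo, bar = 4, 4

    def __init__(self):
        # self.foo and self.bar are attribute lookups, not bare names, so
        # they resolve through the instance and then the class at call time.
        self.baz = dict(((x, y), x + y)
                        for x in range(self.foo)
                        for y in range(self.bar))


class SmallSpam(Spam):
    # Hypothetical subclass overriding the loop bounds.
    foo, bar = 2, 2


print(len(Spam().baz))
print(SmallSpam().baz)
```

Because the generator expression now lives inside a method, `self` is an ordinary local of an enclosing function, so both loops can reach it through a normal closure.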

It looks like this is explained in the section of the PEP entitled
"Early Binding versus Late Binding"
http://www.python.org/dev/peps/pep-0289/#early-binding-versus-late-binding
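
The same early/late split can be seen outside a class. The following is a sketch assuming Python 3; the function and variable names are invented for the illustration. Only the outermost iterable is evaluated when the generator expression is created, while the inner one is evaluated each time the outer loop advances.

```python
def binding_demo():
    outer_limit = 3
    inner_limit = 2
    pairs = ((a, b) for a in range(outer_limit) for b in range(inner_limit))
    # range(outer_limit) was already evaluated when the genexp was created,
    # so rebinding outer_limit now has no effect on it.
    outer_limit = 100
    # range(inner_limit) is evaluated lazily inside the generator, so this
    # rebinding is what the inner loop will see.
    inner_limit = 1
    return list(pairs)


result = binding_demo()
print(result)
```

Here the outer loop still runs three times, but the inner loop sees the rebound value and runs once per outer step.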


Cheers,
Cliff




