

CPython 2.7: Weakset data changing size during internal iteration

 

 



lamialily at cleverpun

Jun 1, 2012, 8:23 AM

Post #1 of 8 (199 views)
CPython 2.7: Weakset data changing size during internal iteration

I've got a bit of a problem - my project uses weak sets in multiple
areas, the problem case in particular being to indicate what objects
are using a particular texture, if any, so that its priority in OpenGL
can be adjusted to match at the same time as it being (de)referenced
by any explicit calls.

Problem is that for certain high-frequency operations, it seems
there's too much data going in and out for it to handle - the
following traceback is given to me (project path changed to protect
the innocent):

Traceback (most recent call last):
  File "C:\foo\bar\game.py", line 279, in update
    self.player.update()
  File "C:\foo\bar\player.py", line 87, in update
    PlayerBullet((self.x + 8, self.y + 9), 0, self.parent)
  File "C:\foo\bar\player.py", line 96, in __init__
    self.sprite = video.Sprite("testbullet", 0)
  File "C:\foo\bar\video.py", line 95, in __init__
    self.opengl_id = reference_texture(self, target)
  File "C:\foo\bar\video.py", line 310, in reference_texture
    if not video_handler.textures[target].references:
  File "C:\Python27\lib\_weakrefset.py", line 66, in __len__
    return sum(x() is not None for x in self.data)
  File "C:\Python27\lib\_weakrefset.py", line 66, in <genexpr>
    return sum(x() is not None for x in self.data)
RuntimeError: Set changed size during iteration

I can post the sources relevant to the traceback upon request, but
hopefully a traceback is sufficient as the most immediate problem is
in Python's libraries.

Any suggestions on what to do about this? I can't exactly throw a
.copy() in on top of the data iteration and call it good since it's
part of the standard Python library.

~Temia
--
When on earth, do as the earthlings do.
--
http://mail.python.org/mailman/listinfo/python-list


tjreedy at udel

Jun 1, 2012, 3:42 PM

Post #2 of 8 (191 views)
Re: CPython 2.7: Weakset data changing size during internal iteration [In reply to]

On 6/1/2012 11:23 AM, Temia Eszteri wrote:
> I've got a bit of a problem - my project uses weak sets in multiple
> areas, the problem case in particular being to indicate what objects
> are using a particular texture, if any, so that its priority in OpenGL
> can be adjusted to match at the same time as it being (de)referenced
> by any explicit calls.
>
> Problem is that for certain high-frequency operations, it seems
> there's too much data going in and out for it to handle - the
> following traceback is given to me (project path changed to protect
> the innocent):
>
> Traceback (most recent call last):
>   File "C:\foo\bar\game.py", line 279, in update
>     self.player.update()
>   File "C:\foo\bar\player.py", line 87, in update
>     PlayerBullet((self.x + 8, self.y + 9), 0, self.parent)
>   File "C:\foo\bar\player.py", line 96, in __init__
>     self.sprite = video.Sprite("testbullet", 0)
>   File "C:\foo\bar\video.py", line 95, in __init__
>     self.opengl_id = reference_texture(self, target)
>   File "C:\foo\bar\video.py", line 310, in reference_texture
>     if not video_handler.textures[target].references:

I gather that the .references attribute is sometimes/always a weakset.
To determine its boolean value, it computes its length. For regular
sets, this is sensible as .__len__() returns a pre-computed value.

>   File "C:\Python27\lib\_weakrefset.py", line 66, in __len__
>     return sum(x() is not None for x in self.data)

Given that len(weakset) is defined (sensibly) as the number of currently
active members, it must count. weakset should really have a .__bool__
method that uses any() instead of sum(). That might reduce, but not
necessarily eliminate, your problem.
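
A minimal sketch of the difference between the two approaches, using a plain set of weak references rather than the real WeakSet. The names `Sprite`, `count_live` and `has_live` are made up for illustration. Note that the truth hook is spelled `__nonzero__` in Python 2 and `__bool__` in Python 3.

```python
import weakref


class Sprite(object):
    """Hypothetical stand-in for any weakly referenceable object."""


def count_live(refs):
    # Same expression as the __len__ in the traceback: visits every reference.
    return sum(r() is not None for r in refs)


def has_live(refs):
    # any() stops at the first live referent instead of counting them all.
    return any(r() is not None for r in refs)


sprites = [Sprite() for _ in range(1000)]
refs = set(weakref.ref(s) for s in sprites)

print(count_live(refs))
print(has_live(refs))
```

Both functions still iterate over the set, so a resize in the middle of the loop can hit either one; any() only shortens the window.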

>   File "C:\Python27\lib\_weakrefset.py", line 66, in <genexpr>
>     return sum(x() is not None for x in self.data)
> RuntimeError: Set changed size during iteration

I can think of two reasons:

1. You are using multiple threads and another thread does something to
change the size of the set during the iteration. Solution? put a lock
around the if-statement so no other thread can change self.data during
the iteration.

2. Weakset members remove themselves from the set before returning None.
(Just a thought, in case you are not using threads).
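
A rough sketch of the locking idea from reason 1, assuming every access to the shared set goes through one wrapper. `LockedRefs` is a hypothetical name, not part of the poster's project or the standard library. It does nothing for reason 2, because a weak reference callback can run in the same thread that is iterating.

```python
import threading


class LockedRefs(object):
    """Hypothetical wrapper: every access to the set holds the same lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = set()

    def add(self, item):
        with self._lock:
            self._items.add(item)

    def discard(self, item):
        with self._lock:
            self._items.discard(item)

    def count(self):
        # Iterate only while holding the lock, so no other thread can
        # resize the set in the middle of the loop.
        with self._lock:
            return sum(1 for _ in self._items)


refs = LockedRefs()
workers = [threading.Thread(target=refs.add, args=(n,)) for n in range(8)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
print(refs.count())
```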

--
Terry Jan Reedy



lamialily at cleverpun

Jun 1, 2012, 4:40 PM

Post #3 of 8 (190 views)
Re: CPython 2.7: Weakset data changing size during internal iteration [In reply to]

On Fri, 01 Jun 2012 18:42:22 -0400, Terry Reedy <tjreedy [at] udel>
wrote:

>I gather that the .references attribute is sometimes/always a weakset.
>To determine its boolean value, it computes its length. For regular
>sets, this is sensible as .__len__() returns a pre-computed value.

Indeed. Back when I was using 2.6 to develop, it was simply an integer
counter, but that led to some difficulties in maintaining it in case
some sprite objects hadn't been explicitly killed.

>Given that len(weakset) is defined (sensibly) as the number of currently
>active members, it must count. weakset should really have a .__bool__
>method that uses any() instead of sum(). That might reduce, but not
>necessarily eliminate, your problem.

Think it might be worth looking into submitting a patch for the next
minor releases for Python if it turns out to solve the problem?
Failing that, I might just have to check the truth value of the data
attribute inside the weak set manually...

>I can think of two reasons:
>
>1. You are using multiple threads and another thread does something to
>change the size of the set during the iteration. Solution? put a lock
>around the if-statement so no other thread can change self.data during
>the iteration.
>
>2. Weakset members remove themselves from the set before returning None.
>(Just a thought, in case you are not using threads).

It's a multithreaded program to a small extent - I offload I/O
operations, music handling, and a basic, optional debugger console
(which I really wish I could set up to use the real interactive
interpreter instead of the shoddy setup I've got now) to separate
threads, while the main logic operates in one thread due to OpenGL's
issues with multiple Python threads.

Since the sprite object calls to reference a texture in __init__(),
that means no other thread could even safely reference the texture due
to the potential of making OpenGL calls without the relevant context
kept by the main thread (this has made the loading thread kind of
useless, but the texture strings themselves can still be loaded into
temporary memory, and other data like music still works).

If the weak references removing themselves is the case, it seems like
a kind of silly problem - one would imagine they'd wrap the data check
in _IterationGuard in the _weakrefset.py file like they do for calls
to __iter__(). Very strange.
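
To make the guard idea concrete, here is a simplified model of deferred removal. It is a sketch under assumptions, not the standard library's code and not the eventual patch: `GuardedRefs`, `Thing` and `_pending` are hypothetical names, and the immediate cleanup relies on CPython's reference counting.

```python
import weakref


class Thing(object):
    """Hypothetical stand-in for a weakly referenceable object."""


class GuardedRefs(object):
    """Simplified model: removals are deferred while an iteration is running."""

    def __init__(self):
        self.data = set()
        self._iterating = 0
        self._pending = []

        def _remove(ref, selfref=weakref.ref(self)):
            owner = selfref()
            if owner is None:
                return
            if owner._iterating:
                owner._pending.append(ref)  # defer: someone is iterating
            else:
                owner.data.discard(ref)
        self._remove = _remove

    def add(self, item):
        self.data.add(weakref.ref(item, self._remove))

    def __iter__(self):
        self._iterating += 1
        try:
            for ref in self.data:
                item = ref()
                if item is not None:
                    yield item
        finally:
            self._iterating -= 1
            if not self._iterating:
                # Safe to shrink the set now that nothing is iterating.
                while self._pending:
                    self.data.discard(self._pending.pop())


things = [Thing() for _ in range(3)]
refs = GuardedRefs()
for thing in things:
    refs.add(thing)
del thing  # the loop variable would otherwise keep the last object alive

seen = 0
for item in refs:
    seen += 1
    del things[:]  # frees the other two objects mid-iteration on CPython
del item
print(seen, len(refs.data))
```

The loop finishes without a RuntimeError because the callbacks only queue the dead references, and the set is trimmed once the iteration ends.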

Anyway, I truly appreciate your input and suggestions. I'll see if
they have any results, and if so, we can work out submitting a patch.
If not, at least reading through this gave me the idea to just call
the data set inside it, so I can use it as an imperfect but functional
solution within the scope of my project.
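
A sketch of that workaround, assuming the `data` attribute of WeakSet (an implementation detail, not a documented API). `Sprite` and `texture_in_use` are hypothetical names, and the prompt cleanup after `del` relies on CPython's reference counting.

```python
import weakref


class Sprite(object):
    """Hypothetical stand-in for the project's sprite class."""


def texture_in_use(references):
    # `data` is the WeakSet's internal set of weakref objects. Testing its
    # truth value needs no iteration, but it can still hold dead references
    # whose removal has been deferred, hence "imperfect".
    return bool(references.data)


references = weakref.WeakSet()
sprite = Sprite()
references.add(sprite)
in_use_before = texture_in_use(references)

del sprite  # on CPython the weakref callback discards the entry right away
in_use_after = texture_in_use(references)
print(in_use_before, in_use_after)
```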

~Temia
--
When on earth, do as the earthlings do.


steve+comp.lang.python at pearwood

Jun 1, 2012, 8:05 PM

Post #4 of 8 (190 views)
Re: CPython 2.7: Weakset data changing size during internal iteration [In reply to]

On Fri, 01 Jun 2012 08:23:44 -0700, Temia Eszteri wrote:

> I've got a bit of a problem - my project uses weak sets in multiple
> areas, the problem case in particular being to indicate what objects are
> using a particular texture, if any, so that its priority in OpenGL can
> be adjusted to match at the same time as it being (de)referenced by any
> explicit calls.
>
> Problem is that for certain high-frequency operations, it seems there's
> too much data going in and out for it to handle

I doubt that very much. If you are using threads, it is more likely your
code has a race condition where you are modifying a weak set at the same
time another thread is trying to iterate over it (in this case, to
determine its length), and because it's a race condition, it only
happens when conditions are *just right*. Since race conditions hitting
are usually rare, you only notice it when there's a lot of data.



--
Steven


lamialily at cleverpun

Jun 1, 2012, 8:24 PM

Post #5 of 8 (190 views)
Re: CPython 2.7: Weakset data changing size during internal iteration [In reply to]

On 02 Jun 2012 03:05:01 GMT, Steven D'Aprano
<steve+comp.lang.python [at] pearwood> wrote:

>I doubt that very much. If you are using threads, it is more likely your
>code has a race condition where you are modifying a weak set at the same
>time another thread is trying to iterate over it (in this case, to
>determine its length), and because it's a race condition, it only
>happens when conditions are *just right*. Since race conditions hitting
>are usually rare, you only notice it when there's a lot of data.

Except that the few threads I use don't modify that data at all. The
functions that touch the references set also rely on OpenGL contexts,
which are thread-bound, so they're impossible to call from another
thread without stopping the code in its tracks, unless the context is
explicitly shifted (which it very much isn't).

And I've done some looking through the weak set's code in the
intervening time; it does easily have the potential to cause this kind
of problem, because the weak references it creates are given a callback
that removes them from the data set when their referent is garbage
collected. See for yourself:

Lines 81-84, _weakrefset.py:

    def add(self, item):
        if self._pending_removals:
            self._commit_removals()
        self.data.add(ref(item, self._remove))  # <--

Lines 38-44, likewise: (for some reason called in __init__ rather than
at the class level, but likely to deal with a memory management issue)

    def _remove(item, selfref=ref(self)):
        self = selfref()
        if self is not None:
            if self._iterating:  # <--
                self._pending_removals.append(item)
            else:
                self.data.discard(item)  # <--
    self._remove = _remove

The thing is, as Terry pointed out, its truth value is tested based on
__len__(), which as shown does NOT set the _iterating protection:

    def __len__(self):
        return sum(x() is not None for x in self.data)
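
The mechanism can be reproduced in a single thread with a small model of the code quoted above. This is only a sketch: `Thing` and `provoke_resize` are hypothetical names, a plain set stands in for the WeakSet's `data`, and the timing depends on CPython freeing objects as soon as their last reference goes away.

```python
import weakref


class Thing(object):
    """Hypothetical stand-in for a sprite or any weakly referenceable object."""


def provoke_resize():
    data = set()

    def _remove(ref):
        # Same idea as the callback quoted above: drop the dead reference.
        data.discard(ref)

    things = [Thing() for _ in range(3)]
    for thing in things:
        data.add(weakref.ref(thing, _remove))
    del thing  # the loop variable would otherwise keep the last object alive

    try:
        for ref in data:
            # Dropping the last strong references frees the objects on
            # CPython, which runs _remove and shrinks `data` mid-loop.
            del things[:]
    except RuntimeError as exc:
        return str(exc)
    return None


print(provoke_resize())
```

No second thread is involved: the set shrinks because the callbacks run inside the loop body, and the iterator notices on its next step.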

Don't be so fast to dismiss things when the situation would not have
made a race condition possible to begin with.

~Temia
--
When on earth, do as the earthlings do.


tjreedy at udel

Jun 1, 2012, 11:27 PM

Post #6 of 8 (191 views)
Re: CPython 2.7: Weakset data changing size during internal iteration [In reply to]

On 6/1/2012 7:40 PM, Temia Eszteri wrote:

>> Given that len(weakset) is defined (sensibly) as the number of currently
>> active members, it must count. weakset should really have a .__bool__
>> method that uses any() instead of sum(). That might reduce, but not
>> necessarily eliminate, your problem.
>
> Think it might be worth looking into submitting a patch for the next
> minor releases for Python if it turns out to solve the problem?

I think a patch would be worthwhile even if this is not the source of
your problem. If bool is defined as 'if any ...', that should be the code.

>> I can think of two reasons:
>>
>> 1. You are using multiple threads and another thread does something to
>> change the size of the set during the iteration. Solution? put a lock
>> around the if-statement so no other thread can change self.data during
>> the iteration.
>>
>> 2. Weakset members remove themselves from the set before returning None.
>> (Just a thought, in case you are not using threads).

In other words, it is possible that weakset.__len__ is buggy. Since you
are sure that 1) is not your problem, that seems more likely now.

> If the weak references removing themselves is the case, it seems like
> a kind of silly problem - one would imagine they'd wrap the data check
> in _IterationGuard in the _weakrefset.py file like they do for calls
> to __iter__(). Very strange.

While looking into the weakset code, you might check the tracker for
weakset issues. And also check the test code. I have *no* idea how well
that class has been exercised and tested. Please do submit a patch if
you can if one is needed.

--
Terry Jan Reedy



steve+comp.lang.python at pearwood

Jun 3, 2012, 9:20 AM

Post #7 of 8 (188 views)
Re: CPython 2.7: Weakset data changing size during internal iteration [In reply to]

On Fri, 01 Jun 2012 20:24:30 -0700, Temia Eszteri wrote:

> On 02 Jun 2012 03:05:01 GMT, Steven D'Aprano
> <steve+comp.lang.python [at] pearwood> wrote:
>
>>I doubt that very much. If you are using threads, it is more likely your
>>code has a race condition where you are modifying a weak set at the same
>>time another thread is trying to iterate over it (in this case, to
>>determine its length), and because it's a race condition, it only
>>happens when conditions are *just right*. Since race conditions hitting
>>are usually rare, you only notice it when there's a lot of data.
>
> Except that the few threads I use don't modify that data at all
[...]

And should I have known this from your initial post?


[...]
> Don't be so fast to dismiss things when the situation would not have
> made a race condition possible to begin with.

If you have been part of this newsgroup and mailing list as long as I
have, you should realise that there is no shortage of people who come
here and make grand claims that they have discovered a bug in Python
(either the language, or the standard library). Nine times out of ten,
they have not, and the bug is in their code, or their understanding.

Perhaps you are one of the few who has actually found a bug in the
standard library rather than one in your own code. But your initial post
showed no sign that you had done any investigation beyond reading the
traceback and immediately jumping to the conclusion that it was a bug in
the standard library.

Frankly, I still doubt that your analysis of the problem is correct:

[quote]
Problem is that for certain high-frequency operations, it
seems there's too much data going in and out for it to handle
[end quote]


I still can't see any way for this bug to occur due to "too much data",
as you suggest, or in the absence of one thread modifying the set while
another is iterating over it. But I could be wrong.

In any case, it appears that this bug has already been reported and fixed:

http://bugs.python.org/issue14159


Consider updating to the latest bug fix of 2.7.



--
Steven


lamialily at cleverpun

Jun 3, 2012, 10:55 AM

Post #8 of 8 (188 views)
Re: CPython 2.7: Weakset data changing size during internal iteration [In reply to]

On 03 Jun 2012 16:20:11 GMT, Steven D'Aprano
<steve+comp.lang.python [at] pearwood> wrote:

>And should I have known this from your initial post?

I did discuss the matter with Terry Reedy, actually, but I guess since
the newsgroup-to-mailing list mirror is one-way, there's no actual way
you could've known. :/ Sigh, another problem out of my hands to deal
with. I do apologize for the snippy attitude, if it means anything.

>Frankly, I still doubt that your analysis of the problem is correct:
>
> [quote]
> Problem is that for certain high-frequency operations, it
> seems there's too much data going in and out for it to handle
> [end quote]
>
>
>I still can't see any way for this bug to occur due to "too much data",
>as you suggest, or in the absence of one thread modifying the set while
>another is iterating over it. But I could be wrong.

Well, in this case, I'd consider it more reasonable to look at it from
a different angle, but it was rather poorly phrased at the beginning.
When you've got dozens of objects being garbage-collected from the set
every 16 milliseconds or so though, that's certainly high-frequency
enough to trigger the bug, is it not?

>In any case, it appears that this bug has already been reported and fixed:
>
>http://bugs.python.org/issue14159
>
>Consider updating to the latest bug fix of 2.7.

Alas, I'm already on the latest official release, which doesn't have
the patch yet. I'll just apply it manually.

Though now I'm curious about how regular sets get their truth value,
since a weakset internally performing a length check every time a
texture is referenced or de-referenced, for simple lack of a faster
explicit __bool__, is going to be rather costly when things are
flying around and out of the screen area in large quantities. Hoo boy.
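
On the truth-value question: as Terry noted earlier in the thread, a regular set's __len__() returns a pre-computed value, so testing it is cheap. Any object without a truth hook (`__nonzero__` in Python 2, `__bool__` in Python 3) falls back to __len__(). The sketch below shows that fallback with two hypothetical classes, `CountingBag` and `FastBag`.

```python
class CountingBag(object):
    """Hypothetical container that defines __len__ but no truth hook."""

    def __init__(self, items):
        self._items = list(items)
        self.len_calls = 0

    def __len__(self):
        self.len_calls += 1
        return len(self._items)


class FastBag(CountingBag):
    """Same container with an explicit truth hook."""

    def __bool__(self):  # Python 3 name
        return bool(self._items)

    __nonzero__ = __bool__  # Python 2 name


bag = CountingBag([1, 2, 3])
if bag:  # no truth hook, so the interpreter calls __len__
    pass

fast = FastBag([1, 2, 3])
if fast:  # the truth hook is used and __len__ is never called
    pass

print(bag.len_calls, fast.len_calls)
```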

~Temia
--
The amazing programming device: fuelled entirely by coffee, it codes while
awake and tests while asleep!
