Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Python: Python

When convert two sets with the same elements to lists, are the lists always going to be the same?

 

 

Python python RSS feed   Index | Next | Previous | View Threaded


pengyu.ut at gmail

May 3, 2012, 5:36 PM

Post #1 of 18 (937 views)
Permalink
When convert two sets with the same elements to lists, are the lists always going to be the same?

Hi,

list(a_set)

When convert two sets with the same elements to two lists, are the
lists always going to be the same (i.e., the elements in each list are
ordered the same)? Is it documented anywhere?

--
Regards,
Peng
--
http://mail.python.org/mailman/listinfo/python-list


drsalists at gmail

May 3, 2012, 5:56 PM

Post #2 of 18 (910 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

If you need the same ordering in two lists, you really should sort the
lists - though your comparison function need not be that traditional. You
might be able to get away with not sorting sometimes, but on CPython
upgrades or using different Python interpreters (Pypy, Jython), it's almost
certain the ordering will be allowed to change.

But sorting in a loop is not generally a good thing - there's almost always
a better alternative.

On Thu, May 3, 2012 at 5:36 PM, Peng Yu <pengyu.ut [at] gmail> wrote:

> Hi,
>
> list(a_set)
>
> When convert two sets with the same elements to two lists, are the
> lists always going to be the same (i.e., the elements in each list are
> ordered the same)? Is it documented anywhere?
>
> --
> Regards,
> Peng
> --
> http://mail.python.org/mailman/listinfo/python-list
>


python.list at tim

May 3, 2012, 6:00 PM

Post #3 of 18 (910 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

On 05/03/12 19:36, Peng Yu wrote:
> list(a_set)
>
> When convert two sets with the same elements to two lists, are the
> lists always going to be the same (i.e., the elements in each list are
> ordered the same)? Is it documented anywhere?

Sets are defined as unordered which the documentation[1] confirms.
A simple test seems to show that on cPython2.6 on this box, sets
with 1000000 elements converted to lists without sorting seem to
compare as equal (i.e., cPython is sorting), but I don't see any
guarantee that this should hold for other implementations, so I'd
sort first to ensure the intended behavior.

-tkc

[1]
http://docs.python.org/library/stdtypes.html#set-types-set-frozenset
"""
Being an unordered collection, sets do not record element position
or order of insertion.
"""




--
http://mail.python.org/mailman/listinfo/python-list


bahamutzero8825 at gmail

May 3, 2012, 7:06 PM

Post #4 of 18 (904 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

On 5/3/2012 7:36 PM, Peng Yu wrote:
> When convert two sets with the same elements to two lists, are the
> lists always going to be the same (i.e., the elements in each list are
> ordered the same)? Is it documented anywhere?
Sets are by definition unordered, so depending on their order would not
be a good idea. If the order stays the same, it's at most an
implementation detail which may or may not be consistent across
versions, and will likely not be consistent across implementations.

--
CPython 3.3.0a3 | Windows NT 6.1.7601.17790
--
http://mail.python.org/mailman/listinfo/python-list


tjreedy at udel

May 3, 2012, 9:16 PM

Post #5 of 18 (902 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

On 5/3/2012 8:36 PM, Peng Yu wrote:
> Hi,
>
> list(a_set)
>
> When convert two sets with the same elements to two lists, are the
> lists always going to be the same (i.e., the elements in each list are
> ordered the same)? Is it documented anywhere?

"A set object is an unordered collection of distinct hashable objects".
If you create a set from unequal objects with equal hashes, the
iteration order may (should, will) depend on the insertion order as the
first object added with a colliding hash will be at its 'natural
position in the hash table while succeeding objects will be elsewhere.

Python 3.3.0a3 (default, May 1 2012, 16:46:00)
>>> hash('a')
-292766495615408879
>>> hash(-292766495615408879)
-292766495615408879
>>> a = {'a', -292766495615408879}
>>> b = {-292766495615408879, 'a'}
>>> list(a)
[-292766495615408879, 'a']
>>> list(b)
['a', -292766495615408879]

--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


pengyu.ut at gmail

May 4, 2012, 3:14 AM

Post #6 of 18 (900 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

On Thu, May 3, 2012 at 11:16 PM, Terry Reedy <tjreedy [at] udel> wrote:
> On 5/3/2012 8:36 PM, Peng Yu wrote:
>>
>> Hi,
>>
>> list(a_set)
>>
>> When convert two sets with the same elements to two lists, are the
>> lists always going to be the same (i.e., the elements in each list are
>> ordered the same)? Is it documented anywhere?
>
>
> "A set object is an unordered collection of distinct hashable objects".
> If you create a set from unequal objects with equal hashes, the iteration
> order may (should, will) depend on the insertion order as the first object
> added with a colliding hash will be at its 'natural position in the hash
> table while succeeding objects will be elsewhere.
>
> Python 3.3.0a3 (default, May †1 2012, 16:46:00)
>>>> hash('a')
> -292766495615408879
>>>> hash(-292766495615408879)
> -292766495615408879
>>>> a = {'a', -292766495615408879}
>>>> b = {-292766495615408879, 'a'}
>>>> list(a)
> [-292766495615408879, 'a']
>>>> list(b)
> ['a', -292766495615408879]

Thanks. This is what I'm looking for. I think that this should be
added to the python document as a manifestation (but nonnormalized) of
what "A set object is an unordered collection of distinct hashable
objects" means.

--
Regards,
Peng
--
http://mail.python.org/mailman/listinfo/python-list


rosuav at gmail

May 4, 2012, 4:21 AM

Post #7 of 18 (897 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

On Fri, May 4, 2012 at 8:14 PM, Peng Yu <pengyu.ut [at] gmail> wrote:
> Thanks. This is what I'm looking for. I think that this should be
> added to the python document as a manifestation (but nonnormalized) of
> what "A set object is an unordered collection of distinct hashable
> objects" means.

There are other things that can prove it to be unordered, too; the
exact pattern and order of additions and deletions can affect the
iteration order. The only thing you can be sure of is that you can't
be sure of it.

ChrisA
--
http://mail.python.org/mailman/listinfo/python-list


pengyu.ut at gmail

May 4, 2012, 5:00 AM

Post #8 of 18 (898 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

On Fri, May 4, 2012 at 6:21 AM, Chris Angelico <rosuav [at] gmail> wrote:
> On Fri, May 4, 2012 at 8:14 PM, Peng Yu <pengyu.ut [at] gmail> wrote:
>> Thanks. This is what I'm looking for. I think that this should be
>> added to the python document as a manifestation (but nonnormalized) of
>> what "A set object is an unordered collection of distinct hashable
>> objects" means.
>
> There are other things that can prove it to be unordered, too; the
> exact pattern and order of additions and deletions can affect the
> iteration order. The only thing you can be sure of is that you can't
> be sure of it.

I agree. My point was just to suggest adding more explanations on the
details in the manual.

--
Regards,
Peng
--
http://mail.python.org/mailman/listinfo/python-list


tjreedy at udel

May 4, 2012, 10:43 AM

Post #9 of 18 (899 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

On 5/4/2012 8:00 AM, Peng Yu wrote:
> On Fri, May 4, 2012 at 6:21 AM, Chris Angelico<rosuav [at] gmail> wrote:
>> On Fri, May 4, 2012 at 8:14 PM, Peng Yu<pengyu.ut [at] gmail> wrote:
>>> Thanks. This is what I'm looking for. I think that this should be
>>> added to the python document as a manifestation (but nonnormalized) of
>>> what "A set object is an unordered collection of distinct hashable
>>> objects" means.
>>
>> There are other things that can prove it to be unordered, too; the
>> exact pattern and order of additions and deletions can affect the
>> iteration order. The only thing you can be sure of is that you can't
>> be sure of it.
>
> I agree. My point was just to suggest adding more explanations on the
> details in the manual.

I am not sure how much clearer we can be in the language manual. The
word 'unordered' means just that. If one imposes an arbitrary linear
order on an unordered collection, it is arbitrary. It is frustrating
that people do not want to believe that, and even write tests depending
on today's arbitrary serialization order being deterministic
indefinitely. There is a section about this in the doctest doc, but
people do it anyway. I will think about a sentence to add.

--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


pengyu.ut at gmail

May 4, 2012, 1:08 PM

Post #10 of 18 (900 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

On Fri, May 4, 2012 at 12:43 PM, Terry Reedy <tjreedy [at] udel> wrote:
> On 5/4/2012 8:00 AM, Peng Yu wrote:
>>
>> On Fri, May 4, 2012 at 6:21 AM, Chris Angelico<rosuav [at] gmail> †wrote:
>>>
>>> On Fri, May 4, 2012 at 8:14 PM, Peng Yu<pengyu.ut [at] gmail> †wrote:
>>>>
>>>> Thanks. This is what I'm looking for. I think that this should be
>>>> added to the python document as a manifestation (but nonnormalized) of
>>>> what "A set object is an unordered collection of distinct hashable
>>>> objects" means.
>>>
>>>
>>> There are other things that can prove it to be unordered, too; the
>>> exact pattern and order of additions and deletions can affect the
>>> iteration order. The only thing you can be sure of is that you can't
>>> be sure of it.
>>
>>
>> I agree. My point was just to suggest adding more explanations on the
>> details in the manual.
>
>
> I am not sure how much clearer we can be in the language manual. The word
> 'unordered' means just that. If one imposes an arbitrary linear order on an
> unordered collection, it is arbitrary. It is frustrating that people do not
> want to believe that, and even write tests depending on today's arbitrary
> serialization order being deterministic indefinitely. There is a section
> about this in the doctest doc, but people do it anyway. I will think about a
> sentence to add.

You can just add the example that you posted to demonstrate what the
unordered means. A curious user might want to know under what
condition the "unorderness" can affect the results, because for
trivial examples (like the following), it does seem that there is some
orderness in a set.

set(['a', 'b', 'c'])
set(['c', 'b', 'a'])

--
Regards,
Peng
--
http://mail.python.org/mailman/listinfo/python-list


cs at zip

May 4, 2012, 4:12 PM

Post #11 of 18 (900 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

On 04May2012 15:08, Peng Yu <pengyu.ut [at] gmail> wrote:
| On Fri, May 4, 2012 at 12:43 PM, Terry Reedy <tjreedy [at] udel> wrote:
| > On 5/4/2012 8:00 AM, Peng Yu wrote:
| >> On Fri, May 4, 2012 at 6:21 AM, Chris Angelico<rosuav [at] gmail>  wrote:
| >>> On Fri, May 4, 2012 at 8:14 PM, Peng Yu<pengyu.ut [at] gmail>  wrote:
| >>>> Thanks. This is what I'm looking for. I think that this should be
| >>>> added to the python document as a manifestation (but nonnormalized) of
| >>>> what "A set object is an unordered collection of distinct hashable
| >>>> objects" means.
| >>>
| >>> There are other things that can prove it to be unordered, too; the
| >>> exact pattern and order of additions and deletions can affect the
| >>> iteration order. The only thing you can be sure of is that you can't
| >>> be sure of it.
| >>
| >> I agree. My point was just to suggest adding more explanations on the
| >> details in the manual.
| >
| > I am not sure how much clearer we can be in the language manual. The word
| > 'unordered' means just that. [...]
|
| You can just add the example that you posted to demonstrate what the
| unordered means. A curious user might want to know under what
| condition the "unorderness" can affect the results, because for
| trivial examples (like the following), it does seem that there is some
| orderness in a set.

I'm with Terry here: anything else in the line you suggest would
complicate things for the reader, and potentially mislead.

Future implementation changes (and, indeed, _other_ implementations like
Jython) can change any of this. So there _are_ no ``condition the
"unorderness" can affect the results'': a set is unordered, and you
could even _legitimately_ get different orders from the same set
if you iterate over it twice. It is unlikely, but permissable.

Any attempt to describe such conditions beyond "it might happen at any
time" would be misleading.

| set(['a', 'b', 'c'])
| set(['c', 'b', 'a'])

The language does not say these will get the same iteration order. It
happens that the Python you're using, today, does that.

You can't learn the language specification from watching behaviour;
you learn the guarrenteed behaviour -- what you may rely on happening --
from the specification, and you can test that an implementation obeys (or
at any rate, does not disobey) the specification by watching behaviour.

You seem to be trying to learn the spec from behaviour.

Cheers,
--
Cameron Simpson <cs [at] zip> DoD#743
http://www.cskk.ezoshosting.com/cs/

Loud pipes make noise. Skill and experience save lives.
- EdBob Morandi
--
http://mail.python.org/mailman/listinfo/python-list


pengyu.ut at gmail

May 4, 2012, 4:37 PM

Post #12 of 18 (898 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

On Fri, May 4, 2012 at 6:12 PM, Cameron Simpson <cs [at] zip> wrote:
> On 04May2012 15:08, Peng Yu <pengyu.ut [at] gmail> wrote:
> | On Fri, May 4, 2012 at 12:43 PM, Terry Reedy <tjreedy [at] udel> wrote:
> | > On 5/4/2012 8:00 AM, Peng Yu wrote:
> | >> On Fri, May 4, 2012 at 6:21 AM, Chris Angelico<rosuav [at] gmail> †wrote:
> | >>> On Fri, May 4, 2012 at 8:14 PM, Peng Yu<pengyu.ut [at] gmail> †wrote:
> | >>>> Thanks. This is what I'm looking for. I think that this should be
> | >>>> added to the python document as a manifestation (but nonnormalized) of
> | >>>> what "A set object is an unordered collection of distinct hashable
> | >>>> objects" means.
> | >>>
> | >>> There are other things that can prove it to be unordered, too; the
> | >>> exact pattern and order of additions and deletions can affect the
> | >>> iteration order. The only thing you can be sure of is that you can't
> | >>> be sure of it.
> | >>
> | >> I agree. My point was just to suggest adding more explanations on the
> | >> details in the manual.
> | >
> | > I am not sure how much clearer we can be in the language manual. The word
> | > 'unordered' means just that. [...]
> |
> | You can just add the example that you posted to demonstrate what the
> | unordered means. A curious user might want to know under what
> | condition the "unorderness" can affect the results, because for
> | trivial examples (like the following), it does seem that there is some
> | orderness in a set.
>
> I'm with Terry here: anything else in the line you suggest would
> complicate things for the reader, and potentially mislead.
>
> Future implementation changes (and, indeed, _other_ implementations like
> Jython) can change any of this. So there _are_ no ``condition the
> "unorderness" can affect the results'': a set is unordered, and you
> could even _legitimately_ get different orders from the same set
> if you iterate over it twice. It is unlikely, but permissable.
>
> Any attempt to describe such conditions beyond "it might happen at any
> time" would be misleading.
>
> | set(['a', 'b', 'c'])
> | set(['c', 'b', 'a'])
>
> The language does not say these will get the same iteration order. It
> happens that the Python you're using, today, does that.
>
> You can't learn the language specification from watching behaviour;
> you learn the guarrenteed behaviour -- what you may rely on happening --
> from the specification, and you can test that an implementation obeys (or
> at any rate, does not disobey) the specification by watching behaviour.
>
> You seem to be trying to learn the spec from behaviour.

My point is if something is said in the document, it is better to be
substantiated by an example. I don't think that this has anything with
"learn the spec from behaviour."

--
Regards,
Peng
--
http://mail.python.org/mailman/listinfo/python-list


breamoreboy at yahoo

May 4, 2012, 5:31 PM

Post #13 of 18 (897 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

On 05/05/2012 00:37, Peng Yu wrote:
>
> My point is if something is said in the document, it is better to be
> substantiated by an example. I don't think that this has anything with
> "learn the spec from behaviour."
>

I side with the comments made by Terry Reedy and Cameron Simpson so
please give it a rest, you're flogging a dead horse.

--
Cheers.

Mark Lawrence.

--
http://mail.python.org/mailman/listinfo/python-list


tjreedy at udel

May 4, 2012, 9:28 PM

Post #14 of 18 (896 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

Peng, I actually am thinking about it.

Underlying problem: while unordered means conceptually unordered as far
as the collection goes, the items in the collection, if homogenous
enough, may have a natural order, which users find hard to ignore. Even
if not comparable, an implementation such as CPython that uses linear
sequential memory will impose some order. Even if the implementation
uses unordered (holographic?) memory, order will be imposed to iterate,
as when creating a serialized representation of the collection. Abstract
objects, concrete objects, and serialized representations are three
different things, but people tend to conflate them.

Order consistency issues: if the unordered collection is iterated, when
can one expect the order to be the same? Theoretically, essentially
never, except that iterating dicts by keys, values, or key-value pairs
is guaranteed to be consistent, which means that re-iterating has to be
consistent. I actually think the same might as well be true for sets,
although there is no doc that says so.

If two collections are equal, should the iteration order be the same? It
has always been true that if hash values collide, insertion order
matters. However, a good hash function avoids hash collisions as much as
possible in practical use cases. Without doing something artificial, as
I did with the example, collisions should be especially rare on 64-bit
builds. If one collection has a series of additions and deletions so
that the underlying hash table has a different size than an equal
collection build just from insertions, then order will also be different.

If the same collection is built by insertion in the same order, but in
different runs, bugfix versions, or language versions, will iteration
order by the same? Historically, it has been for CPython for about a
decade, and people has come to depend on that constancy, in spite of
warning not to. Even core developers have not been immune, as the
CPython test suite has a few set or dict iteration order dependencies
until it was recently randomized.

Late last year, it became obvious that this constancy is a practical
denial-of-service security hole. The option to randomize hashing for
each run was added to 2.6, 2.7, 3.1, and 3.2. Randomized hashing by run
is part of 3.3. So some of the discussion above is obsolete. The example
I gave only works for that one run, as hash('a') changes each run. So
iteration order now changes with each run in fact as well as in theory.

For the doc, the problem is what to say and where without being
repetitous (and to get multiple people to agree ;-).

--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


pengyu.ut at gmail

May 5, 2012, 4:04 AM

Post #15 of 18 (895 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

Hi Terry,

Thank you for you detailed email.

> If two collections are equal, should the iteration order be the same? It has
> always been true that if hash values collide, insertion order matters.
> However, a good hash function avoids hash collisions as much as possible in
> practical use cases. Without doing something artificial, as I did with the
> example, collisions should be especially rare on 64-bit builds. If one
> collection has a series of additions and deletions so that the underlying
> hash table has a different size than an equal collection build just from
> insertions, then order will also be different.

The reason that I asked to add the artificial example in the doc is
because I never completely understand the unorderness until I see your
artificial example. You will surely use more words for explaining what
"unorderness" means than just showing your example. And the example
(since formatted as code) is more eye catching than just plain text.

For my case, since I didn't understand the unorderness, I made some
subtle bug in my program, which works fine in my testing code.
However, it produce a small amount of corrupted results for the real
production use, which is harder to debug. It did wasted quite some of
my time.

> For the doc, the problem is what to say and where without being repetitous
> (and to get multiple people to agree ;-).

I agree that people have different opinions on issues like this. But I
think that "The Customer Is God". Readers of the doc is the customers,
the writers of the doc is the producers. The opinion of customers
should carry more weight than producers.

--
Regards,
Peng
--
http://mail.python.org/mailman/listinfo/python-list


rosuav at gmail

May 5, 2012, 4:53 AM

Post #16 of 18 (897 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

On Sat, May 5, 2012 at 9:04 PM, Peng Yu <pengyu.ut [at] gmail> wrote:
> I agree that people have different opinions on issues like this. But I
> think that "The Customer Is God". Readers of the doc is the customers,
> the writers of the doc is the producers. The opinion of customers
> should carry more weight than producers.

Nah, the customer's not God, and I have proof. http://notalwaysright.com/

Oops, now everyone's off reading funny stories about stupid/abusive
customers. Well, when you're all back...

The reason you're reading documentation is to learn. You're not
handing over wads of cash and saying "Do stuff for me"; you're reading
the Player's Handbook and learning which dice to roll when. Perhaps
there's some jargon that you don't understand; in that case, either a
FAQ/glossary or a forum post will set you straight. But if the doc
says that something can't be relied upon, then there's nothing more to
write there.

Documentation that takes ten pages to say something is just as bad as
documentation that leaves stuff out, because it's almost guaranteed
that it won't be read.

ChrisA
--
http://mail.python.org/mailman/listinfo/python-list


emile at fenx

May 5, 2012, 9:04 AM

Post #17 of 18 (895 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

On 5/5/2012 4:04 AM Peng Yu said...

> I agree that people have different opinions on issues like this. But I
> think that "The Customer Is God". Readers of the doc is the customers,
> the writers of the doc is the producers. The opinion of customers
> should carry more weight than producers.

Only to a point. The experience of the producers should carry more
weight than the opinion of the customers.

Emile




--
http://mail.python.org/mailman/listinfo/python-list


pengyu.ut at gmail

May 5, 2012, 9:17 AM

Post #18 of 18 (899 views)
Permalink
Re: When convert two sets with the same elements to lists, are the lists always going to be the same? [In reply to]

> Documentation that takes ten pages to say something is just as bad as
> documentation that leaves stuff out, because it's almost guaranteed
> that it won't be read.

That's the point. If a simple example (6 lines) can demonstrate the
concept, why spending "ten pages" to explain it. My experience is that
for certain things, it is better describe by a spec once you know it,
but it is certainly not true for people to learn it. A reasonable
strategy is to interleave spec with demonstrating examples. There is
no excuse to not to make the manual easier to read.

--
Regards,
Peng
--
http://mail.python.org/mailman/listinfo/python-list

Python python RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.