Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Python: Python

An ordered dictionary for the Python library?

 

 

First page Previous page 1 2 Next page Last page  View All Python python RSS feed   Index | Next | Previous | View Threaded


m.n.summerfield at googlemail

Sep 12, 2007, 12:33 AM

Post #1 of 43 (308 views)
Permalink
An ordered dictionary for the Python library?

I feel that Python lacks one useful data structure: an ordered
dictionary.

I find such data structures v. useful in C++. I know that in Python
the sort function is v. fast, but often I prefer never to sort but
simply to use an ordered data structure in the first place.
(I'm aware that for ordered lists I can use the bisect module, but I
want an ordered key-value data structure.)

I think other people must find such things useful. There are three
implementations on the Python Cookbook site, and one on PyPI, all in
pure Python (plus I have my own implementation, also pure Python).

I would suppose that it would be better if it was implemented in C---
for example, my own pure Python ordered dict loads data about eight
times slower than the built-in dict. Nonetheless, I still find it
worth using for the convenience it offers.

Do other Python programmers feel this lack? Is this worth a PEP?

[.I originally asked about this on the P3K mailing list, but then
realised that it isn't version-specific really.]

--
http://mail.python.org/mailman/listinfo/python-list


michele.simionato at gmail

Sep 12, 2007, 12:47 AM

Post #2 of 43 (286 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

On Sep 12, 9:33 am, Mark Summerfield <m.n.summerfi...@googlemail.com>
wrote:
> I feel that Python lacks one useful data structure: an ordered
> dictionary.
>
> I find such data structures v. useful in C++. I know that in Python
> the sort function is v. fast, but often I prefer never to sort but
> simply to use an ordered data structure in the first place.
> (I'm aware that for ordered lists I can use the bisect module, but I
> want an ordered key-value data structure.)
>
> I think other people must find such things useful. There are three
> implementations on the Python Cookbook site, and one on PyPI, all in
> pure Python (plus I have my own implementation, also pure Python).
>
> I would suppose that it would be better if it was implemented in C---
> for example, my own pure Python ordered dict loads data about eight
> times slower than the built-in dict. Nonetheless, I still find it
> worth using for the convenience it offers.
>
> Do other Python programmers feel this lack? Is this worth a PEP?
>

Yes, this is a serious lack in the standard library.

Michele Simionato

--
http://mail.python.org/mailman/listinfo/python-list


s.mientki at id

Sep 12, 2007, 12:51 AM

Post #3 of 43 (286 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

Mark Summerfield wrote:
> I feel that Python lacks one useful data structure: an ordered
> dictionary.
>
> I find such data structures v. useful in C++. I know that in Python
> the sort function is v. fast, but often I prefer never to sort but
> simply to use an ordered data structure in the first place.
> (I'm aware that for ordered lists I can use the bisect module, but I
> want an ordered key-value data structure.)
>
> I think other people must find such things useful. There are three
> implementations on the Python Cookbook site, and one on PyPI, all in
> pure Python (plus I have my own implementation, also pure Python).
>
> I would suppose that it would be better if it was implemented in C---
> for example, my own pure Python ordered dict loads data about eight
> times slower than the built-in dict. Nonetheless, I still find it
> worth using for the convenience it offers.
>
> Do other Python programmers feel this lack? Is this worth a PEP?
>
>
Yes I think it's really useful,
(or at least I'm used to it in other languages ;-)
If you're going to extend the dictionary,
there's one other flag I'm continuously missing:
"case-insensitive" key.

cheers,
Stef Mientki

> [.I originally asked about this on the P3K mailing list, but then
> realised that it isn't version-specific really.]
>
>
--
http://mail.python.org/mailman/listinfo/python-list


BjornSteinarFjeldPettersen at gmail

Sep 12, 2007, 3:28 AM

Post #4 of 43 (286 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

On Sep 12, 9:33 am, Mark Summerfield <m.n.summerfi...@googlemail.com>
wrote:
> I feel that Python lacks one useful data structure: an ordered
> dictionary.
>
> I find such data structures v. useful in C++. I know that in Python
> the sort function is v. fast, but often I prefer never to sort but
> simply to use an ordered data structure in the first place.
> (I'm aware that for ordered lists I can use the bisect module, but I
> want an ordered key-value data structure.)
[...]
> Do other Python programmers feel this lack? Is this worth a PEP?

I usually make a distinction between a sorted dict, where iteration
(and potentially positional indexing) happens in sorted key order; and
an ordered dict where items maintain insertion order. I use the latter
all the time, and e.g. Django's model metaclass does some minor magic
to overcome the fact that field-order is lost by the time your
metaclass gets control, since the attributes are passed as a regular
dict.

-- bjorn

--
http://mail.python.org/mailman/listinfo/python-list


steve at REMOVE-THIS-cybersource

Sep 12, 2007, 5:42 AM

Post #5 of 43 (281 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

On Wed, 12 Sep 2007 07:33:45 +0000, Mark Summerfield wrote:

> I feel that Python lacks one useful data structure: an ordered
> dictionary.
>
> I find such data structures v. useful in C++.
[snip]


Personally, I've never missed an ordered dict. What do people use them
for?

In fact, I'm not sure what people mean by ordered dicts. I assume they
mean the keys are ordered, not the values. But ordered by what? Insertion
order? Modification order? Alphabetical order?


--
Steven.
--
http://mail.python.org/mailman/listinfo/python-list


michele.simionato at gmail

Sep 12, 2007, 5:46 AM

Post #6 of 43 (282 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

On Sep 12, 2:42 pm, Steven D'Aprano <st...@REMOVE-THIS-
cybersource.com.au> wrote:
> On Wed, 12 Sep 2007 07:33:45 +0000, Mark Summerfield wrote:
> In fact, I'm not sure what people mean by ordered dicts. I assume they
> mean the keys are ordered, not the values. But ordered by what? Insertion
> order? Modification order? Alphabetical order?

Insertion order.

M.S.

--
http://mail.python.org/mailman/listinfo/python-list


steve at REMOVE-THIS-cybersource

Sep 12, 2007, 5:52 AM

Post #7 of 43 (281 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

On Wed, 12 Sep 2007 09:51:02 +0200, stef wrote:

> If you're going to extend the dictionary, there's one other flag I'm
> continuously missing: "case-insensitive" key.

Completely untested and probably buggy as anything, but here's an
absolutely minimal case-insensitive dictionary.

class CaselessDict(dict):
def __setitem__(self, key, value):
super(CaselessDict, self).__setitem__(key.lower(), value)
def __getitem__(self, key):
return super(CaselessDict, self).__getitem__(key.lower())


--
Steven.
--
http://mail.python.org/mailman/listinfo/python-list


m.n.summerfield at googlemail

Sep 12, 2007, 6:54 AM

Post #8 of 43 (280 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

On 12 Sep, 13:46, Michele Simionato <michele.simion...@gmail.com>
wrote:
> On Sep 12, 2:42 pm, Steven D'Aprano <st...@REMOVE-THIS-
>
> cybersource.com.au> wrote:
> > On Wed, 12 Sep 2007 07:33:45 +0000, Mark Summerfield wrote:
> > In fact, I'm not sure what people mean by ordered dicts. I assume they
> > mean the keys are ordered, not the values. But ordered by what? Insertion
> > order? Modification order? Alphabetical order?
>
> Insertion order.
>
> M.S.


Actually I meant by key order, so insertion order doesn't matter at
all. If you need a dictionary-like data structure that respects
insertion order you could use a list of (key, value) tuples.

Another respondent asked about use cases.

I have found time and again that I needed (key, value) pairs where the
key is some string that provides a representation for human readers
and the value is some internal key (e.g., database ID) that the system
uses internally. In these cases I often need to present the user with
a list of items from which to choose and want to present them in
sorted order. Naturally, I could use an ordinary dict and then do
this:

for string in sorted(d.keys()):
process(string)

But what happens when I need to do this a *lot* and when the number of
items is hundreds or a few thousands? I'm having to sort again and
again, since it is often the case that the items in the list changes
during the runtime of the application. So my solution in C++ is to use
an ordered dictionary (a map in C++ jargon), which in Python means I
can simply write:

for string in od.keys():
process(string)

Another situation where this is useful is if you need to hold lists of
strings preserving case but want them to be presented case-
insensitively. Here, you populate the ordered dictionary like this:

for string in somelist:
od[string.lower()] = string

Now you can iterate over od in case-insensitive order and yet still
have access to the case-sensitive original, and never have to
explicitly sort the strings.

I think it comes down to the nail and hammer issue: if you have a
hammer all problems look nail shaped. In Python we have dict and list
so everything looks suitable for those data structures. But having had
a wrench (ordered dictionaries) in C++, I find that some problems need
a wrench. I think that once Python programmers have access to an
ordered dictionary some proportion of them will find it painful to
program without it, so I really think it belongs in the standard
library.

--
http://mail.python.org/mailman/listinfo/python-list


michele.simionato at gmail

Sep 12, 2007, 7:04 AM

Post #9 of 43 (279 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

On Sep 12, 3:54 pm, Mark Summerfield <m.n.summerfi...@googlemail.com>
wrote:
> On 12 Sep, 13:46, Michele Simionato <michele.simion...@gmail.com>
>
> Actually I meant by key order, so insertion order doesn't matter at
> all. If you need a dictionary-like data structure that respects
> insertion order you could use a list of (key, value) tuples.
>
> Another respondent asked about use cases.
>
> I have found time and again that I needed (key, value) pairs where the
> key is some string that provides a representation for human readers
> and the value is some internal key (e.g., database ID) that the system
> uses internally. In these cases I often need to present the user with
> a list of items from which to choose and want to present them in
> sorted order. Naturally, I could use an ordinary dict and then do
> this:
>
> for string in sorted(d.keys()):
> process(string)
>
> But what happens when I need to do this a *lot* and when the number of
> items is hundreds or a few thousands? I'm having to sort again and
> again, since it is often the case that the items in the list changes
> during the runtime of the application. So my solution in C++ is to use
> an ordered dictionary (a map in C++ jargon), which in Python means I
> can simply write:
>
> for string in od.keys():
> process(string)
>

For your use case I would wrap a list [(key, value)] with a dict-like
object and I would use the bisect module in the standard library to
keep
the inner list ordered.

M.S.

--
http://mail.python.org/mailman/listinfo/python-list


duncan.booth at invalid

Sep 12, 2007, 7:15 AM

Post #10 of 43 (279 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

Steven D'Aprano <steve[at]REMOVE-THIS-cybersource.com.au> wrote:

> In fact, I'm not sure what people mean by ordered dicts. I assume they
> mean the keys are ordered, not the values. But ordered by what?
> Insertion order? Modification order? Alphabetical order?

All of the above at different times. Usually I think they mean original
insertion order, I'd more often like modification order though.

Insertion order can certainly be useful: any situation where you currently
keep items in a list because the original order is important but also copy
them into a dictionary or set because you need fast lookup.
e.g. if you want only unique items out of a list but need to preserve order
of first occurrence.

Another example would be uml->code round tripping: you parse in an existing
file containing method definitions, then update the method definitions from
a UML model and write the file out again: ArchGenXML does exactly that, but
last I looked the methods were written out in whatever order the dictionary
stored them which really messes up version control. It really should
preserve the order that methods appeared in the file.

Modification order would be great for a cache. Touch items whenever they
are referenced and then whenever the cache gets too big just pop the oldest
until you get back to the desired size.

Alphabetical or other sorted order when you want to get smallest/largest or
next smaller/larger then a specific item. I don't agree you necessarily
mean ordered by key rather than value, I think you could specify any
sortkey and expect updating the value to restore the ordering. Maybe if you
are recording some stats and want to regularly output the top 10?
--
http://mail.python.org/mailman/listinfo/python-list


rschroev_nospam_ml at fastmail

Sep 12, 2007, 7:17 AM

Post #11 of 43 (280 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

Michele Simionato schreef:
> On Sep 12, 2:42 pm, Steven D'Aprano <st...@REMOVE-THIS-
> cybersource.com.au> wrote:
>> On Wed, 12 Sep 2007 07:33:45 +0000, Mark Summerfield wrote:
>> In fact, I'm not sure what people mean by ordered dicts. I assume they
>> mean the keys are ordered, not the values. But ordered by what? Insertion
>> order? Modification order? Alphabetical order?
>
> Insertion order.

Mark referred to C++ and the sort function, so I assume he meant
alphabetical order (or the equivalent for types other than strings).

--
If I have been able to see further, it was only because I stood
on the shoulders of giants. -- Isaac Newton

Roel Schroeven
--
http://mail.python.org/mailman/listinfo/python-list


whamil1 at entergy

Sep 12, 2007, 7:39 AM

Post #12 of 43 (280 views)
Permalink
RE: An ordered dictionary for the Python library? [In reply to]

> From: Michele Simionato
>
> On Sep 12, 3:54 pm, Mark Summerfield <m.n.summerfi...@googlemail.com>
> wrote:
> > On 12 Sep, 13:46, Michele Simionato <michele.simion...@gmail.com>
> >
> > Actually I meant by key order, so insertion order doesn't matter at
> > all. If you need a dictionary-like data structure that respects
> > insertion order you could use a list of (key, value) tuples.
> >
> > Another respondent asked about use cases.
> >
> > I have found time and again that I needed (key, value) pairs where
the
> > key is some string that provides a representation for human readers
> > and the value is some internal key (e.g., database ID) that the
system
> > uses internally. In these cases I often need to present the user
with
> > a list of items from which to choose and want to present them in
> > sorted order. Naturally, I could use an ordinary dict and then do
> > this:
> >
> > for string in sorted(d.keys()):
> > process(string)
> >
> > But what happens when I need to do this a *lot* and when the number
of
> > items is hundreds or a few thousands? I'm having to sort again and
> > again, since it is often the case that the items in the list changes
> > during the runtime of the application. So my solution in C++ is to
use
> > an ordered dictionary (a map in C++ jargon), which in Python means I
> > can simply write:
> >
> > for string in od.keys():
> > process(string)
> >
>
> For your use case I would wrap a list [(key, value)] with a dict-like
> object and I would use the bisect module in the standard library to
> keep
> the inner list ordered.


Or subclass dict to carry along a sorted list of keys with it and return
that when dict.keys() is called. Even better, only have .keys() sort
the keys list when a key has been added to it since the last call.


--
-Bill Hamilton
--
http://mail.python.org/mailman/listinfo/python-list


arkanes at gmail

Sep 12, 2007, 8:49 AM

Post #13 of 43 (278 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

On 9/12/07, Mark Summerfield <m.n.summerfield[at]googlemail.com> wrote:
> On 12 Sep, 13:46, Michele Simionato <michele.simion...@gmail.com>
> wrote:
> > On Sep 12, 2:42 pm, Steven D'Aprano <st...@REMOVE-THIS-
> >
> > cybersource.com.au> wrote:
> > > On Wed, 12 Sep 2007 07:33:45 +0000, Mark Summerfield wrote:
> > > In fact, I'm not sure what people mean by ordered dicts. I assume they
> > > mean the keys are ordered, not the values. But ordered by what? Insertion
> > > order? Modification order? Alphabetical order?
> >
> > Insertion order.
> >
> > M.S.
>
>
> Actually I meant by key order, so insertion order doesn't matter at
> all. If you need a dictionary-like data structure that respects
> insertion order you could use a list of (key, value) tuples.
>

These sort of disagreements about what such a beast should "obviously"
do are a good part of the reason why one doesn't exist in the standard
library. A sorted dict is so trivial that it's not really seen as
necessary, since all you have to do is sort the keys(). An ordered
dict is generally better satisfied with a list, as you've seen.
There's implementations of all these for people who want something
wrapped up in the cookbook.


>But what happens when I need to do this a *lot* and when the number of
>items is hundreds or a few thousands?

keys = sorted(d.keys())
for k in key:
...

You've got two choices: either you insert more than you iterate, in
which case sorting at the point of iteration is the most efficient
option, or you iterate more than you insert, in which case sorting
after you insert is easy. There's no way that an ordered dict
implementation could satisfy both constraints. Can you honestly think
of a use case where the performance of your sorting is a bottleneck
*and* it's a suitably general case that a standard lib class should
optimize for that specific case? Even in this short thread we've seen
3 different ideas of what the dict should do.
--
http://mail.python.org/mailman/listinfo/python-list


pavlovevidence at gmail

Sep 12, 2007, 9:06 AM

Post #14 of 43 (279 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

On Sep 12, 10:04 am, Michele Simionato <michele.simion...@gmail.com>
wrote:
> On Sep 12, 3:54 pm, Mark Summerfield <m.n.summerfi...@googlemail.com>
> wrote:
>
>
>
> > On 12 Sep, 13:46, Michele Simionato <michele.simion...@gmail.com>
>
> > Actually I meant by key order, so insertion order doesn't matter at
> > all. If you need a dictionary-like data structure that respects
> > insertion order you could use a list of (key, value) tuples.
>
> > Another respondent asked about use cases.
>
> > I have found time and again that I needed (key, value) pairs where the
> > key is some string that provides a representation for human readers
> > and the value is some internal key (e.g., database ID) that the system
> > uses internally. In these cases I often need to present the user with
> > a list of items from which to choose and want to present them in
> > sorted order. Naturally, I could use an ordinary dict and then do
> > this:
>
> > for string in sorted(d.keys()):
> > process(string)
>
> > But what happens when I need to do this a *lot* and when the number of
> > items is hundreds or a few thousands? I'm having to sort again and
> > again, since it is often the case that the items in the list changes
> > during the runtime of the application. So my solution in C++ is to use
> > an ordered dictionary (a map in C++ jargon), which in Python means I
> > can simply write:
>
> > for string in od.keys():
> > process(string)
>
> For your use case I would wrap a list [(key, value)] with a dict-like
> object and I would use the bisect module in the standard library to
> keep
> the inner list ordered.

That would make insertion and deletion would be quite expensive.
Which might be ok for the OP, but it's often better to go with a
btree.

Which, you know, would be a reasonable candidate for collections
module.


Carl Banks

--
http://mail.python.org/mailman/listinfo/python-list


pavlovevidence at gmail

Sep 12, 2007, 9:18 AM

Post #15 of 43 (279 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

On Sep 12, 8:42 am, Steven D'Aprano <st...@REMOVE-THIS-
cybersource.com.au> wrote:
> On Wed, 12 Sep 2007 07:33:45 +0000, Mark Summerfield wrote:
> > I feel that Python lacks one useful data structure: an ordered
> > dictionary.
>
> > I find such data structures v. useful in C++.
>
> [snip]
>
> Personally, I've never missed an ordered dict. What do people use them
> for?

I once had a GUI that displayed the contents of a dict as a tree. The
user could add entries, and the entries would display at the end.
Hence, the ordered dict.

I could have kept the order separately, but what was the point? I
just whipped up a homemade ordered dict class, it worked fine, and I
didn't have to worry about keeping them synchronized.


Carl Banks

--
http://mail.python.org/mailman/listinfo/python-list


horpner at yahoo

Sep 12, 2007, 9:58 AM

Post #16 of 43 (279 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

On 2007-09-12, Mark Summerfield <m.n.summerfield[at]googlemail.com>
wrote:
> Actually I meant by key order, so insertion order doesn't
> matter at all. If you need a dictionary-like data structure
> that respects insertion order you could use a list of (key,
> value) tuples.
>
> Another respondent asked about use cases.

A mapping with sorted keys could theoretically be convenient when
your keys are comparable but not hashable.

--
Neil Cerutti
--
http://mail.python.org/mailman/listinfo/python-list


m.n.summerfield at googlemail

Sep 12, 2007, 10:07 AM

Post #17 of 43 (278 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

On 12 Sep, 15:04, Michele Simionato <michele.simion...@gmail.com>
wrote:
> On Sep 12, 3:54 pm, Mark Summerfield <m.n.summerfi...@googlemail.com>
> wrote:
>
>
>
> > On 12 Sep, 13:46, Michele Simionato <michele.simion...@gmail.com>
>
> > Actually I meant by key order, so insertion order doesn't matter at
> > all. If you need a dictionary-like data structure that respects
> > insertion order you could use a list of (key, value) tuples.
>
> > Another respondent asked about use cases.
>
> > I have found time and again that I needed (key, value) pairs where the
> > key is some string that provides a representation for human readers
> > and the value is some internal key (e.g., database ID) that the system
> > uses internally. In these cases I often need to present the user with
> > a list of items from which to choose and want to present them in
> > sorted order. Naturally, I could use an ordinary dict and then do
> > this:
>
> > for string in sorted(d.keys()):
> > process(string)
>
> > But what happens when I need to do this a *lot* and when the number of
> > items is hundreds or a few thousands? I'm having to sort again and
> > again, since it is often the case that the items in the list changes
> > during the runtime of the application. So my solution in C++ is to use
> > an ordered dictionary (a map in C++ jargon), which in Python means I
> > can simply write:
>
> > for string in od.keys():
> > process(string)
>
> For your use case I would wrap a list [(key, value)] with a dict-like
> object and I would use the bisect module in the standard library to
> keep
> the inner list ordered.
>
> M.S.

Actually, in my own ordereddict I wrap a dict and keep the keys in
order using the bisect module.
This works fine, but I imagine that a C-based implementation based on
B*trees or skiplists would be a lot faster.

In response to some of the following emails, I note one responder says
such a class is trivial to implement. Yes, and so presumably is
defaultdict, but the latter is in the standard library. One of the
"problems" of Python is that the range of skills of its users varies
from casual "amateur" programmers to guru professionals and everywhere
inbetween. Sure at somewhere on that continuum implementing *anything*
is "trivial", but for some users it is worthwhile to have it out of
the box.

What I like about ordered dictionaries is the ability to have sorted
data without sorting. It lets me have multiple sort orders. For
example, in some programs I give the user the choice of how they want
their data ordered and can provide such orderings without sorting,
simply by maintaining a set of parallel data structures with keys
being ordered and values being pointers (object references in Python)
to the relevant values. This costs in memory (but not much since no
values are duplicated and the values are usually large the keys
usually small).

Psuedocode example:

class Person:
def __init__(self, name, payrollno, grade, dept):
self.name = name
# etc

For each person created:
personFromName["%s\t%012d" % (person.name.lower(),
person.payrollno)] = person
personFromPayrollNo[person.payrollno] = person
personFromGrade["%s\t%012d" % (person.grade, person.payrollno)] =
person
personFromDept["%s\t%012d" % (person.dept, person.payrollno)] =
person

So now I have four orderings with no sorting. (I have to disambiguate
to avoid duplicates and to provide subordering.)

If I have 10,000 people I would potentially have to resort 10,000
instances whenever the user chose a new sort order: this way I just
have to iterate over the right ordered dictionary.

So I do feel that there are good use cases, and I think that for some
Python users having this in the library would be a genuine benefit.

Most of the use cases I envisage involve lots of insertions at start
up, relatively few during runtime, and lots of iterations over the
data as the user changes their view of it.

Of course Python does have an ordered dictionary, it is just not
necessarily always installed. You can use the bsdbd3 module's btopen()
method and pass None as filename to get an in-memory Btree, but I
believe (not sure) that there may be length limits on the keys.

--
http://mail.python.org/mailman/listinfo/python-list


rowen at cesmail

Sep 12, 2007, 12:08 PM

Post #18 of 43 (276 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

In article <13efnmi4n959a59[at]corp.supernews.com>,
Steven D'Aprano <steve[at]REMOVE-THIS-cybersource.com.au> wrote:

> On Wed, 12 Sep 2007 07:33:45 +0000, Mark Summerfield wrote:
>
> > I feel that Python lacks one useful data structure: an ordered
> > dictionary.
> >
> > I find such data structures v. useful in C++.
> [snip]
>
>
> Personally, I've never missed an ordered dict. What do people use them
> for?
>
> In fact, I'm not sure what people mean by ordered dicts. I assume they
> mean the keys are ordered, not the values. But ordered by what? Insertion
> order? Modification order? Alphabetical order?

Opinions vary, but I think of it this way:
- A sorted dict keeps its keys sorted. This has an obvious alternative
just sort the keys later (though I can imagine cases where it would be
nicer not to).

- An ordered dict remembers the order of insertion. This is useful when
you want to retain the order of insertion yet you also want fast lookup
of individual items. Unlike a sorted dict, there is no trivial
alternative (though you can argue about "trivial") and I really wish
there was. There are several free versions available.

-- Russell
--
http://mail.python.org/mailman/listinfo/python-list


tuomas.vesterinen at pp

Sep 12, 2007, 2:25 PM

Post #19 of 43 (276 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

Mark Summerfield wrote:
> I feel that Python lacks one useful data structure: an ordered
> dictionary.

Why it should be a dict. With it you can only maintain the order
x1<x2<x3... But by definition the order only requires x1<=x2<=x3...
There are many use cases where keys are not unique. I'd prefer a list
implementation, which would maintain the user defined order (cmp, key).
So you could map your objects with any order.

TV
--
http://mail.python.org/mailman/listinfo/python-list


"http://phr.cx" at NOSPAM

Sep 12, 2007, 4:03 PM

Post #20 of 43 (276 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

Mark Summerfield <m.n.summerfield[at]googlemail.com> writes:
> I feel that Python lacks one useful data structure: an ordered
> dictionary.
> Do other Python programmers feel this lack? Is this worth a PEP?

Yes. It should use a functional data structure.
--
http://mail.python.org/mailman/listinfo/python-list


m.n.summerfield at googlemail

Sep 12, 2007, 11:27 PM

Post #21 of 43 (270 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

On 13 Sep, 00:03, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
> Mark Summerfield <m.n.summerfi...@googlemail.com> writes:
> > I feel that Python lacks one useful data structure: an ordered
> > dictionary.
> > Do other Python programmers feel this lack? Is this worth a PEP?
>
> Yes. It should use a functional data structure.

Could you elaborate?

Here's my impression of the postings so far: there are 3 categories of
response:

(A) There is no need for such a thing.

(B) It is useful, but so easy to implement that it needn't be in the
library.

(C) It is worth having an ordered data structure of some kind.

Regarding (A) and (B): I remember Python before it had the sets
module. "There's no need for sets, you can do them with dicts".
Thankfully sets are in the language and I for one use them
extensively.

For (C) what I had in mind was:

An API that is identical to a dict, with the few additional methods/
behaviours listed below:
- If an item is inserted it is put in the right place (because the
underlying data structure, b*tree, skiplist or whatever is
intrinsically ordered by < on the key)
- Extra methods
key_at(index : int) -> key
item_at(index : int) -> (key, value)
value_at(index : int) -> value
set_value_at(index : int, value) # no return value necessary but
maybe key if convenient
od[x:y:z] -> list of values ordered by key # can read a slice but
not assign a slice
and maybe
keys_for(fromindex : int, uptobutexcludingindex : int) -> ordered
list of keys

I've given examples of the use cases I envisage (and that I use in
both Python with my own ordered dict and in C++ with the map class) in
a previous posting, and I've shown the API I envisage above. I know
that some people might prefer ordering by value or the option of case-
insensitive ordering, but both these can be achieved using the API
above, e.g.,
od[value.lower()] = value # case-insensitive ordering of list of
values
And as for duplicates there are two approaches I use: (1) make each
string unique as I showed in my previous post, or (2) use a value that
itself is a list, dict or set.
(As for those who have suggested an ordered data structure that isn't
a dict, I don't see a conflict, I'd be happy to see more data
structures in the collections module.)

Is what I've suggested sufficient (and sufficiently general) for the
standard library? If it is not, what should be done, or is there some
other approach and API that would better provide the functionality
envisaged?


--
http://mail.python.org/mailman/listinfo/python-list


apardon at forel

Sep 13, 2007, 12:06 AM

Post #22 of 43 (271 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

On 2007-09-12, Mark Summerfield <m.n.summerfield[at]googlemail.com> wrote:
> On 12 Sep, 13:46, Michele Simionato <michele.simion...@gmail.com>
> wrote:
>> On Sep 12, 2:42 pm, Steven D'Aprano <st...@REMOVE-THIS-
>>
>> cybersource.com.au> wrote:
>> > On Wed, 12 Sep 2007 07:33:45 +0000, Mark Summerfield wrote:
>> > In fact, I'm not sure what people mean by ordered dicts. I assume they
>> > mean the keys are ordered, not the values. But ordered by what? Insertion
>> > order? Modification order? Alphabetical order?
>>
>> Insertion order.
>>
>> M.S.
>
>
> Actually I meant by key order, so insertion order doesn't matter at
> all. If you need a dictionary-like data structure that respects
> insertion order you could use a list of (key, value) tuples.

If you want a dictionary that iterates over his keys and does so
in a sorted manner you probably want a tree.

One posible implemenation can be found here:

http://www.pardon-sleeuwaegen.be/antoon/avltree.html


I hope it is usefull to you.

--
Antoon Pardon
--
http://mail.python.org/mailman/listinfo/python-list


mensanator at aol

Sep 13, 2007, 10:19 AM

Post #23 of 43 (269 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

On Sep 13, 1:27 am, Mark Summerfield <m.n.summerfi...@googlemail.com>
wrote:
> On 13 Sep, 00:03, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
>
> > Mark Summerfield <m.n.summerfi...@googlemail.com> writes:
> > > I feel that Python lacks one useful data structure: an ordered
> > > dictionary.
> > > Do other Python programmers feel this lack? Is this worth a PEP?
>
> > Yes. It should use a functional data structure.
>
> Could you elaborate?
>
> Here's my impression of the postings so far: there are 3 categories of
> response:
>
> (A) There is no need for such a thing.
>
> (B) It is useful, but so easy to implement that it needn't be in the
> library.
>
> (C) It is worth having an ordered data structure of some kind.
>
> Regarding (A) and (B): I remember Python before it had the sets
> module. "There's no need for sets, you can do them with dicts".
> Thankfully sets are in the language and I for one use them
> extensively.
>
> For (C) what I had in mind was:
>
> An API that is identical to a dict, with the few additional methods/
> behaviours listed below:
> - If an item is inserted it is put in the right place (because the
> underlying data structure, b*tree, skiplist or whatever is
> intrinsically ordered by < on the key)
> - Extra methods
> key_at(index : int) -> key
> item_at(index : int) -> (key, value)
> value_at(index : int) -> value
> set_value_at(index : int, value) # no return value necessary but
> maybe key if convenient
> od[x:y:z] -> list of values ordered by key # can read a slice but
> not assign a slice
> and maybe
> keys_for(fromindex : int, uptobutexcludingindex : int) -> ordered
> list of keys
>
> I've given examples of the use cases I envisage (and that I use in
> both Python with my own ordered dict and in C++ with the map class) in
> a previous posting, and I've shown the API I envisage above. I know
> that some people might prefer ordering by value or the option of case-
> insensitive ordering, but both these can be achieved using the API
> above, e.g.,
> od[value.lower()] = value # case-insensitive ordering of list of
> values
> And as for duplicates there are two approaches I use: (1) make each
> string unique as I showed in my previous post, or (2) use a value that
> itself is a list, dict or set.
> (As for those who have suggested an ordered data structure that isn't
> a dict, I don't see a conflict, I'd be happy to see more data
> structures in the collections module.)
>
> Is what I've suggested sufficient (and sufficiently general) for the
> standard library?

Is the index the insertion order? The only use I have an ordered
dictionary is to quickly determine that a sequence value is the
first duplicate found (the structure can have millions of numbers)
and then play back (so they don't have to be re-calculated) the
sequence from the first duplicate to the end.

So I would get the index when I find the duplicate and then
iterate data[index:] to get the sequence in insertion order?


> If it is not, what should be done, or is there some
> other approach and API that would better provide the functionality
> envisaged?


--
http://mail.python.org/mailman/listinfo/python-list


"http://phr.cx" at NOSPAM

Sep 13, 2007, 1:14 PM

Post #24 of 43 (258 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

Mark Summerfield <m.n.summerfield[at]googlemail.com> writes:
> > Yes. It should use a functional data structure.
> Could you elaborate?

I mean use a data structure like an AVL tree, that can make a new dict
object quickly, whose contents are the same as the old object except
for one element (i.e. most of the contents are shared with the old
dict). You should also have ordered versions of both lists and sets.

Consider the situation with lists:

a = some 1000 element list
b = a + [23]

Now a has 1000 elements and b has 1001 elements (23 followed by a's
elements), but constructing b required copying all of a's elements.

It's sort of the same with sets:

a = 1000 element set
b = a + set((23,))

b had to copy the contents of a. By using a tree structure, you can
construct b in O(log(1000)) operations so that most of the elements
are shared with a, while at the same time you don't mutate a. That
makes it a lot easier to program in a style where you have a bunch of
slightly different versions of some set or dict flying around. For
an example of where this came up, see this thread:

http://groups.google.com/group/comp.lang.python/browse_thread/thread/78b5953488d772e9/82f701c302122777

For sample implemetations and API's, you could look at the Hedgehog
Lisp version of AVL trees (including a Python dict-like interface):

http://hedgehog.oliotalo.fi/

and Haskell's Data.Map library:

http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Map.html

I think there is something similar in the ML Basis library but I'm not
familiar with that.

For more about functional data structures, see some of Chris Okasaki's
articles:

http://www.eecs.usma.edu/webs/people/okasaki/pubs.html

Dr. Okasaki also has written a book about the topic, called "Purely
Functional Data Structures", which is very good.

That said, in Python maybe this stuff would be easier to use
improperly because Python coding style uses mutation so much. It
might be best to limit sharing to the case of frozen dicts and frozen
sets.
--
http://mail.python.org/mailman/listinfo/python-list


jstroud at mbi

Sep 13, 2007, 1:20 PM

Post #25 of 43 (255 views)
Permalink
Re: An ordered dictionary for the Python library? [In reply to]

Mark Summerfield wrote:
> - If an item is inserted it is put in the right place (because the
> underlying data structure, b*tree, skiplist or whatever is
> intrinsically ordered by < on the key)

+1 for all your suggestions below, but -1 for the above. You suggest that

myOrderedDict['key'] = value

would place it in the dictionary in sorted order depending on 'key'.
More natural to some of us (or maybe its just me) would be to append the
key/value pair to the end of items.

My Argument: The problem with inserting at sorted position is the
assumption that the dict is already sorted, but maybe I want the key
ordering to reflect something arbitrary, like a database, etc (e.g.
'Last', 'First', 'MI'). So now I assign to an OrderedDict that already
has these latter 3 keys:

myOrderedDict['Favorite Color'] = 'chartruse'

Where does it go? Answer: it probably depends on how the underlying
sorter works and (even worse) the internal state of the sorter's data
structure. So, my intuition is that SortedDict behavior should be kept
distinct from OrderedDict behavior.

> - Extra methods
> key_at(index : int) -> key
> item_at(index : int) -> (key, value)
> value_at(index : int) -> value
> set_value_at(index : int, value) # no return value necessary but
> maybe key if convenient
> od[x:y:z] -> list of values ordered by key # can read a slice but
> not assign a slice

I do not see why assigning a slice is a bad idea here. The behavior
could be such:

od = OrderedDict((('Last':'Smith'), ('First':'John'), ('MI':'J'))

od[:1] = ['Brown']
od[:2] = ['Glotz', 'Joe']

od[:2] = ['Williams', 'Mary', 'Francis'] # <== ValueError

The latter would be a phenomenon similar to unpacking a tuple of the
wrong length.

> and maybe
> keys_for(fromindex : int, uptobutexcludingindex : int) -> ordered
> list of keys

Of course the naming is redundant for these in my opinion. E.g.:

key_at -> key
item_at -> item
value_at -> value
set_value_at -> set_value (not simply set)
keys_for -> keys (takes optional argument(s))

James
--
http://mail.python.org/mailman/listinfo/python-list

First page Previous page 1 2 Next page Last page  View All Python python RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact lists@gossamer-threads.com
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.