drsalists at gmail
Jun 5, 2012, 5:05 PM
Post #1 of 1
I've put together a comparison of some tree datastructures for Python, with
Python Tree datastructure comparison
varied runtime and varied workload:
I hope to find time to add heaps to the article at some point, but for now,
it only covers trees and the treap.
The short version:
1. The 2-3 tree gave incorrect results, so I eliminated it. This may or
may not have been my fault.
2. The Red-black tree was performing so slowly that it was making the
graphs hard to read, so I eliminated it. It actually seemed to be doing
pretty well at first, until I realized that garbage collections were
becoming very slow with red-black trees.
3. The top three performers were, under various workloads and with
various runtimes: The AVL Tree, The Splay Tree and The Treap.
4. The Treap was consistently either in first or second place.
5. All the datastructures examined were ported to python 3 in such a way
that they would continue to work on python 2 - I've at times taken
liberties like allowing log(n) range's, which of course are eagerly
evaluated in python 2. The modules were also modified to pass pylint.
These are all trees (and the treap), for now, with dictionary-like
interfaces, that allow you to do things like look up the least (or
greatest) value in log(n) time. The need for such a datastructure is
uncommon, but does come up from time to time - implementing a cache is
foremost in my mind. If you just need fast lookups without any particular
ordering, you're almost always going to be better off with a hash table