
bearophileHUGS at lycos
Nov 16, 2006, 4:44 PM
Post #1 of 5
(146 views)
Permalink
|
|
dict.reserve and other tricks
|
|
I have started doing practice creating C extensions for CPython, so here are two ideas I have had, possibly useless. If you keep adding elements to a CPython dict/set, it periodically rebuilds itself. So maybe dict.reserve(n) and a set.reserve(n) methods may help, reserving enough (empty) memory for about n *distinct* keys the programmer wants to add to the dict/set in a short future. I have seen that the the C API of the dicts doesn't allow this, and I don't know if this can be implemented modifying the dicts a bit. Do you think this may be useful? ------------------------------- Sometime I need to remove some elements from dicts/sets. If I know a rule that tells what elements have to be removed I can use code like this: import itertools def filterset(predicate, inset): return set(itertools.ifilter(predicate, inset)) print filterset(lambda x: x & 1, set(range(10))) Output: set([1, 3, 9, 5, 7]) And a similar one to filter dicts that works with a predicate(key, val). Most of the times such functions are good enough, but sometimes the dicts are big, so to reduce memory used I remove keys in place: def filterdict(pred, indict): todel = [k for k,v in indict.iteritems() if not pred(k,v)] for key in todel: del indict[key] d = dict.fromkeys(xrange(8), 0) print d filterdict(lambda k,v: k & 1, d) print d # Output: # {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0} # {1: 0, 3: 0, 5: 0, 7: 0} But doing the same thing while iterating on the dict may be faster and use even less memory. This iteration&deletion capability is probably not safe enough to be used inside Python programs, but a compiled C module with a function that works like that filterdict (and deletes while iterating) may be created, and its use is safe from Python programs. The C++ STL API of maps allows to delete items while iterating on the map (but probably it has to be done only when necessary, because it may be a source for bugs): typedef map<int, string> isMap; typedef isMap::value_type isValType; typedef isMap::iterator isMapItor; void main() { isMap m; ... // items added to m for(isMapItor itr = m.begin(); itr != m.end();) { if(itr->second == "somestring") m.erase(itr++); else ++itr; } The Python/C API, 7.4.1 Dictionary Objects, says that's not possible with CPython dicts: >The dictionary p should not be mutated during iteration. It is safe (since Python 2.1) to modify the values of the keys as you iterate over the dictionary, but only so long as the set of keys does not change.< Do you think it may be useful to create to create such C filterdict function that deletes while iterating? (To create such function it probably has to bypass the CPython dict API). Bye, bearophile -- http://mail.python.org/mailman/listinfo/python-list
|