
Mailing List Archive: Python: Dev

dictnotes.txt out of date?

 

 



eliben at gmail

Jun 13, 2012, 12:43 AM


Hi pydev,

I was looking at the memory allocation strategy of dict, out of
curiosity, and noted that Objects/dictnotes.txt is out of date as far
as the parameters go. It says about PyDict_STARTSIZE:

----
* PyDict_STARTSIZE. Starting size of dict (unless an instance dict).
Currently set to 8. Must be a power of two.
New dicts have to zero-out every cell.
Increasing improves the sparseness of small dictionaries but costs
time to read in the additional cache lines if they are not already
in cache. That case is common when keyword arguments are passed.
Prior to version 3.3, PyDict_MINSIZE was used as the starting size
of a new dict.
-----

Although it mentions 3.3, I find no reference to PyDict_STARTSIZE in
the code anywhere.
Also it mentions PyDict_MINSIZE, which doesn't exist any more - having
been replaced by PyDict_MINSIZE_SPLIT and PyDict_MINSIZE_COMBINED.

I don't know what else is out of date, just looked at those and they
were. Maybe it would make sense to kill dictnotes.txt, folding some of
its more important contents into comments in dictobject.c, since the
latter has a higher chance of being maintained along with code
changes?

Eli
_______________________________________________
Python-Dev mailing list
Python-Dev [at] python
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com


mark at hotpy

Jun 13, 2012, 7:22 AM

Re: dictnotes.txt out of date?

Eli Bendersky wrote:
> Hi pydev,
>
> I was looking at the memory allocation strategy of dict, out of
> curiosity, and noted that Objects/dictnotes.txt is out of date as far
> as the parameters go. It says about PyDict_STARTSIZE:
>
> ----
> * PyDict_STARTSIZE. Starting size of dict (unless an instance dict).
> Currently set to 8. Must be a power of two.
> New dicts have to zero-out every cell.
> Increasing improves the sparseness of small dictionaries but costs
> time to read in the additional cache lines if they are not already
> in cache. That case is common when keyword arguments are passed.
> Prior to version 3.3, PyDict_MINSIZE was used as the starting size
> of a new dict.
> -----
>
> Although it mentions 3.3, I find no reference to PyDict_STARTSIZE in
> the code anywhere.
> Also it mentions PyDict_MINSIZE, which doesn't exist any more - having
> been replaced by PyDict_MINSIZE_SPLIT and PyDict_MINSIZE_COMBINED.

That's my fault. I didn't update dictnotes.txt when I changed
PyDict_STARTSIZE to PyDict_MINSIZE_COMBINED.

>
> I don't know what else is out of date, just looked at those and they
> were. Maybe it would make sense to kill dictnotes.txt, folding some of
> its more important contents into comments in dictobject.c, since the
> latter has a higher chance of being maintained along with code
> changes?

I think that the parts of dictnotes.txt that just duplicate comments in
dictobject.c should be removed.
However, I think it is worth keeping dictnotes.txt as it has historical
information and results of previous experiments.

Cheers,
Mark


eliben at gmail

Jun 13, 2012, 8:03 AM

Re: dictnotes.txt out of date?

>> I was looking at the memory allocation strategy of dict, out of
>> curiosity, and noted that Objects/dictnotes.txt is out of date as far
>> as the parameters go. It says about PyDict_STARTSIZE:
>>
>> ----
>> * PyDict_STARTSIZE. Starting size of dict (unless an instance dict).
>>    Currently set to 8. Must be a power of two.
>>    New dicts have to zero-out every cell.
>>    Increasing improves the sparseness of small dictionaries but costs
>>    time to read in the additional cache lines if they are not already
>>    in cache. That case is common when keyword arguments are passed.
>>    Prior to version 3.3, PyDict_MINSIZE was used as the starting size
>>    of a new dict.
>> -----
>>
>> Although it mentions 3.3, I find no reference to PyDict_STARTSIZE in
>> the code anywhere.
>> Also it mentions PyDict_MINSIZE, which doesn't exist any more - having
>> been replaced by PyDict_MINSIZE_SPLIT and PyDict_MINSIZE_COMBINED.
>
>
> That's my fault. I didn't update dictnotes.txt when I changed
> PyDict_STARTSIZE to PyDict_MINSIZE_COMBINED.

Could you update it now?

>> I don't know what else is out of date, just looked at those and they
>> were. Maybe it would make sense to kill dictnotes.txt, folding some of
>> its more important contents into comments in dictobject.c, since the
>> latter has a higher chance of being maintained along with code
>> changes?
>
>
> I think that the parts of dictnotes.txt that just duplicate comments in
> dictobject.c should be removed.
> However, I think it is worth keeping dictnotes.txt as it has historical
> information and results of previous experiments.

Personally I think that describing the customization #defines belongs
in the source, above the relevant #defines, rather than in a separate
file. No problem with leaving historical notes and misc ruminations in
the separate .txt file, though.

Eli


raymond.hettinger at gmail

Jun 13, 2012, 12:52 PM

Re: dictnotes.txt out of date?

On Jun 13, 2012, at 10:35 AM, Eli Bendersky wrote:

> Did you mean to send this to the list, Raymond?


Yes. I wanted to find out whether someone approved changing
all the dict tunable parameters. I thought those weren't supposed
to have changed. PEP 412 notes that the existing parameters
were finely tuned and it did not make recommendations for changing them.

At one point, I spent a full month testing all of the tunable parameters
using dozens of popular Python applications. The values used in Py3.2
represent the best settings for most apps. They should not have been
changed without deep thought and substantial testing.

The reduction of the dict min size has an immediate impact on code
using multiple keyword arguments.

The reduced growth rate (from x4 to x2) negatively impacts apps that
have dicts with a steady size but constantly changing keys
(removing an old key and adding a new one). The lru_cache is
an example. The reduced growth causes it to resize much more
frequently than before.
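
[Editorial sketch, not part of the original message.] The effect
described above can be illustrated with a toy model. It assumes that a
table resizes once live plus deleted ("dummy") slots reach 2/3 of its
capacity, that every delete leaves a dummy behind, that every insert
takes a fresh slot, and that a resize picks the smallest power of two
larger than live_keys * growth_factor. The function count_resizes and
its parameters are hypothetical names chosen for this illustration; this
is not CPython's actual code, which can reuse dummy slots and differs in
other details.

```python
def count_resizes(live_keys, churn_ops, growth_factor, min_size=8):
    """Count resizes in a toy open-addressing table under key churn.

    Each churn operation deletes one old key and inserts one new key, so
    the number of live keys stays constant while dummy slots pile up.
    """
    def new_size(used):
        # Smallest power of two strictly greater than used * growth_factor.
        size = min_size
        while size <= used * growth_factor:
            size *= 2
        return size

    size = new_size(live_keys)
    fill = live_keys          # live slots + dummy slots
    resizes = 0
    for _ in range(churn_ops):
        # The delete turns a live slot into a dummy (fill unchanged);
        # the insert claims a fresh slot (fill grows by one).
        fill += 1
        if fill * 3 >= size * 2:      # 2/3 load threshold reached
            size = new_size(live_keys)
            fill = live_keys          # a resize discards the dummies
            resizes += 1
    return resizes


if __name__ == "__main__":
    for factor in (4, 2):
        print(factor, count_resizes(100, 10_000, factor))
```

Under these assumptions, a table holding 100 live keys gets 512 slots
with a growth factor of 4 but only 256 with a factor of 2, so the
smaller table runs out of headroom for dummies sooner and resizes more
often over the same number of churn operations.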

I think the tunable parameters should be switched back to what they
were before. Tim and others spent a lot of time getting those right
and my month of detailed testing confirmed that those were excellent
choices.

The introduction of Mark's shared-key dicts was orthogonal to the
issue of correct parameter settings. Those parameters did not have
to change and probably should not have been changed.


Raymond
