Mailing List Archive: Python: Bugs

[issue7355] Struct incorrectly compiles format strings

 

 


report at bugs

Nov 19, 2009, 1:42 AM

Post #1 of 8
[issue7355] Struct incorrectly compiles format strings

New submission from Steve Krenzel <sgk284 [at] gmail>:

The struct module has a calcsize() function which reports the size of the data for a specified format
string. In some cases, as far as I can tell, the reported size is wrong.

To repro:
>>> from struct import calcsize
>>> calcsize("ci")
8
>>> calcsize("ic")
5

The correct answer is 5 (a single-byte character and a four-byte int take up 5 bytes of space). For
some reason, when a 'c' is followed by an 'i', calcsize() instead allocates 4 bytes to the 'c'.
This has been verified in 2.6 and 2.5.

You can also repro this by using 's', '2c', and similar combinations in place of 'c', as well as 'I'
in place of 'i'. This might affect other combinations as well.

----------
components: Library (Lib)
messages: 95467
nosy: sgk284
severity: normal
status: open
title: Struct incorrectly compiles format strings
type: behavior
versions: Python 2.5, Python 2.6

_______________________________________
Python tracker <report [at] bugs>
<http://bugs.python.org/issue7355>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com


report at bugs

Nov 19, 2009, 4:46 AM

Post #2 of 8
[issue7355] Struct incorrectly compiles format strings [In reply to]

Eric Smith <eric [at] trueblade> added the comment:

It's a padding issue, having to do with putting values at the correct
word boundaries.
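
For comparison, here is a minimal Python 3 sketch of the difference between native mode, which aligns items, and the standard-size modes selected by a prefix such as "=", which never pad. The variable names are only illustrative, and the native sizes depend on the platform.

```python
import struct

# Native mode (no prefix, or "@") aligns each item the way the C compiler would.
native_ci = struct.calcsize("@ci")
native_ic = struct.calcsize("@ic")

# Standard modes ("=", "<", ">", "!") use fixed sizes and insert no padding.
standard_ci = struct.calcsize("=ci")
standard_ic = struct.calcsize("=ic")

print(native_ci, native_ic)      # platform-dependent; typically 8 5
print(standard_ci, standard_ic)  # 5 5
```

If the data is not meant to mirror a C struct in memory, a standard-mode prefix gives the order-independent size the original report expected.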

----------
nosy: +eric.smith
resolution: -> invalid
stage: -> committed/rejected
status: open -> closed



report at bugs

Nov 19, 2009, 6:33 AM

Post #3 of 8
[issue7355] Struct incorrectly compiles format strings [In reply to]

Mark Dickinson <dickinsm [at] gmail> added the comment:

What Eric said. You can see the padding explicitly in the results of
struct.pack:

>>> import struct
>>> struct.pack("ci", '*', 0x12131415) # 8-byte result, 3 padding bytes
'*\x00\x00\x00\x15\x14\x13\x12'
>>> struct.pack("ic", 0x12131415, '*') # 5-byte result, no padding.
'\x15\x14\x13\x12*'

Note the 3 zero bytes in the first result string.
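
The session above is from Python 2. A rough Python 3 equivalent might look like the sketch below; the only change assumed is that the 'c' code takes a bytes object of length 1. The exact byte values shown by print depend on the platform's byte order and int alignment.

```python
import struct

# Python 3 requires a bytes object of length 1 for the 'c' code.
padded = struct.pack("ci", b"*", 0x12131415)
unpadded = struct.pack("ic", 0x12131415, b"*")

# The gap between the char and the int is filled with zero bytes.
pad_len = len(padded) - len(unpadded)
gap = padded[1:1 + pad_len]

print(padded)    # on a typical little-endian platform: b'*\x00\x00\x00\x15\x14\x13\x12'
print(unpadded)  # on a typical little-endian platform: b'\x15\x14\x13\x12*'
print(pad_len)   # typically 3
```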

This gets reported frequently enough that I wonder whether the docs
should be rearranged and/or expanded. The existence of padding is
mentioned, but not particularly prominently or thoroughly.

----------
nosy: +mark.dickinson



report at bugs

Nov 19, 2009, 7:02 AM

Post #4 of 8
[issue7355] Struct incorrectly compiles format strings [In reply to]

Mark Dickinson <dickinsm [at] gmail> added the comment:

Reopening for possible doc clarification. Suggestions welcome!

----------
assignee: -> mark.dickinson
components: +Documentation, Extension Modules -Library (Lib)
keywords: +easy
priority: -> low
resolution: invalid ->
stage: committed/rejected -> needs patch
status: closed -> open
versions: +Python 2.7, Python 3.1, Python 3.2 -Python 2.5



report at bugs

Nov 19, 2009, 11:03 AM

Post #5 of 8
[issue7355] Struct incorrectly compiles format strings [In reply to]

Steve Krenzel <sgk284 [at] gmail> added the comment:

Just for clarification, why does "ci" get padded but "ic" doesn't?
While I agree that updating the documentation would help clarify,
perhaps either everything should be padded to word boundaries or
nothing should.

It is weird behavior that "ic" != "ci". If both formats were 8 bytes
then my first thought would have been "Oh, it's just getting padded",
but with this inconsistency it appeared as a bug.

Whatever the reason behind this discrepancy is, it should definitely be
included in the doc updates.

----------



report at bugs

Nov 19, 2009, 11:07 AM

Post #6 of 8
[issue7355] Struct incorrectly compiles format strings [In reply to]

Eric Smith <eric [at] trueblade> added the comment:

It's basically because nothing comes after the 'c'. If you put something
after it, such as a zero-length integer, you'll see:

>>> from struct import calcsize
>>> calcsize("ci")
8
>>> calcsize("ic")
5
>>> calcsize("ic0i")
8
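
The same effect can be seen in the packed bytes. This is a small Python 3 sketch, assuming a bytes argument for 'c'; the "0i" item consumes no argument and only pads the end of the data out to an int boundary.

```python
import struct

plain = struct.pack("ic", 0x12131415, b"*")
padded = struct.pack("ic0i", 0x12131415, b"*")

# Everything after the 5 bytes of real data is trailing padding.
trailing = padded[len(plain):]

print(len(plain), len(padded))  # typically 5 8
print(trailing)                 # typically b'\x00\x00\x00'
```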

----------



report at bugs

Nov 19, 2009, 11:10 AM

Post #7 of 8
[issue7355] Struct incorrectly compiles format strings [In reply to]

Mark Dickinson <dickinsm [at] gmail> added the comment:

> Just for clarification, why does "ci" get padded but "ic" doesn't?

Because no padding is necessary in the second case: both the integer and
the character already start at a position that's a multiple of 4---the
integer at position 0 and the character at position 4.

In the first case, without padding, the integer wouldn't start at a word
boundary.

The aim is to make sure that the byte sequence output by struct.pack
matches the layout of a corresponding C struct. In the first case,
inter-item padding is necessary to make that work; in the second, it isn't.
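
The offset rule described above can be sketched in a few lines of Python. native_layout is a hypothetical helper, not part of the struct module, and it assumes that each item's alignment equals its native size. That holds for 'c', 'h' and 'i' on typical platforms but is not guaranteed everywhere. Repeat counts and byte-order prefixes are not handled.

```python
import struct

def native_layout(fmt):
    """Return ([(code, offset), ...], total_size) for single-character codes.

    Assumption: alignment of each item == its native size.
    """
    offset = 0
    items = []
    for code in fmt:
        size = struct.calcsize(code)
        # Round the offset up to the next multiple of the item's alignment.
        offset = (offset + size - 1) // size * size
        items.append((code, offset))
        offset += size
    # No trailing padding is added, matching what calcsize reports.
    return items, offset

print(native_layout("ci"))  # typically ([('c', 0), ('i', 4)], 8)
print(native_layout("ic"))  # typically ([('i', 0), ('c', 4)], 5)
```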

You could argue that in the second case, Python should add trailing
padding, but I'm not sure what the point would be.

----------



report at bugs

Nov 19, 2009, 11:47 AM

Post #8 of 8
[issue7355] Struct incorrectly compiles format strings [In reply to]

Mark Dickinson <dickinsm [at] gmail> added the comment:

I'm half-convinced that struct.pack *should* ideally add trailing padding
in the same situation that C does, for consistency with C. Then calcsize
would match C's sizeof. If you're writing or reading a struct from C,
it's probably easiest/most natural to write or read sizeof(my_struct)
bytes, rather than worrying about stripping trailing padding for
efficiency.
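
One way to check the comparison with C's sizeof from Python is ctypes, whose Structure types follow the platform's native layout rules. This is only a sketch; the class names CharInt and IntChar are made up for illustration.

```python
import ctypes
import struct

class CharInt(ctypes.Structure):
    # Corresponds to: struct { char c; int i; }
    _fields_ = [("c", ctypes.c_char), ("i", ctypes.c_int)]

class IntChar(ctypes.Structure):
    # Corresponds to: struct { int i; char c; }
    _fields_ = [("i", ctypes.c_int), ("c", ctypes.c_char)]

sizeof_ci = ctypes.sizeof(CharInt)
sizeof_ic = ctypes.sizeof(IntChar)

# sizeof includes trailing padding, so "ic" needs the "0i" suffix to match it.
print(sizeof_ci, struct.calcsize("ci"))    # typically 8 8
print(sizeof_ic, struct.calcsize("ic"))    # typically 8 5
print(sizeof_ic, struct.calcsize("ic0i"))  # typically 8 8
```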

I don't see a sensible way to make this change without breaking backwards
compatibility, though.

(Note: this still wouldn't mean that the calcsize result would be
independent of order: calcsize('cci') and calcsize('cic') would still be
different, for example, on a typical platform.)
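
A quick way to see this order dependence, with and without trailing padding, is the sketch below. The printed sizes are what a typical platform with a 4-byte, 4-byte-aligned int would give, not a guarantee.

```python
import struct

sizes = {}
for fmt in ("cci", "cic"):
    bare = struct.calcsize(fmt)           # no trailing padding
    padded = struct.calcsize(fmt + "0i")  # padded out to an int boundary
    sizes[fmt] = (bare, padded)
    print(fmt, bare, padded)

# Typical output:
# cci 8 8
# cic 9 12
```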

Eric's solution of adding '0i' should be included in the documentation
update.

----------

