Mailing List Archive: Python: Bugs

[issue15504] pickle/cPickle saves invalid/incomplete data


report at bugs

Jul 30, 2012, 8:43 AM

Post #1 of 4
[issue15504] pickle/cPickle saves invalid/incomplete data

New submission from Philipp Lies:

I just stumbled upon a very serious bug in cPickle where cPickle stores the data passed to it only partially without a warning/error:

# create a random data string longer than 8 GB
import os
import cPickle
random_string = os.urandom(int(1.1*2**33))
print len(random_string)
fout = open('test.pickle', 'wb')
cPickle.dump(random_string, fout, 2)
fout.close()
fin = open('test.pickle', 'rb')
random_string2 = cPickle.load(fin)
print len(random_string2)
print random_string == random_string2

The loaded string is significantly shorter than the original, meaning that some of the data was lost while storing the string. This is a serious issue. However, when I use pickle instead, writing fails with
error: 'i' format requires -2147483648 <= number <= 2147483647
so I guess pickle is not able to handle large data. Therefore, either cPickle should raise an error as well, or pickle/cPickle should be patched to handle larger data.
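The error text quoted above has the same form as the message struct.pack raises when a value does not fit the 'i' (4-byte signed) format. The sketch below reproduces that limit directly; the helper name pack_length is hypothetical, and the assumption that pure-Python pickle packs the string length this way is inferred from the message, not stated in the report.

```python
import struct

MAX_INT32 = 2**31 - 1  # largest value the 'i' format accepts


def pack_length(n):
    """Pack a length as a little-endian 4-byte signed integer (hypothetical helper)."""
    return struct.pack("<i", n)


# The largest representable length packs without complaint.
packed = pack_length(MAX_INT32)

# One more than that is rejected with struct.error instead of being truncated.
overflow_message = None
try:
    pack_length(MAX_INT32 + 1)
except struct.error as exc:
    overflow_message = str(exc)
```

Raising here is the behaviour the report asks cPickle to match: a loud failure instead of a silently shortened file.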

Code to reproduce error using numpy (that's how I stumbled upon it):
import numpy as np
import cPickle as pickle
A = np.random.randn(1080,1920,553)
fout = open('test.pickle', 'wb')
pickle.dump(A, fout, 2)
fout.close()
fin = open('test.pickle', 'rb')
B = pickle.load(fin)
Here, numpy detects that the amount of data is wrong and throws an error. However, this is still serious, because saving does not raise an error, so the user expects that the data are safely stored.
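For scale, the size of the array in the numpy example can be worked out without allocating it. This is only arithmetic, assuming np.random.randn returns 8-byte float64 values; the variable names are illustrative.

```python
# Shape used in the numpy reproduction above.
shape = (1080, 1920, 553)
bytes_per_element = 8  # assumption: float64, as returned by np.random.randn

n_elements = 1
for dim in shape:
    n_elements *= dim

n_bytes = n_elements * bytes_per_element

# Largest length a 4-byte signed integer can describe.
max_int32 = 0x7FFFFFFF

exceeds_int32 = n_bytes > max_int32
size_gib = n_bytes / float(2**30)
```

The raw buffer is about 8.5 GiB, well beyond what a 32-bit signed length field can describe.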

I guess this might be related to http://bugs.python.org/issue13555 which is still open.

Python 2.7.3 on latest Ubuntu with numpy 1.6.2, 64bit architecture, 128GB RAM

----------
messages: 166906
nosy: Philipp.Lies
priority: normal
severity: normal
status: open
title: pickle/cPickle saves invalid/incomplete data
type: crash
versions: Python 2.7

_______________________________________
Python tracker <report [at] bugs>
<http://bugs.python.org/issue15504>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com


report at bugs

Jul 30, 2012, 9:44 AM

Post #2 of 4
[issue15504] pickle/cPickle saves invalid/incomplete data [In reply to]

Martin v. Löwis added the comment:

People can probably debate endlessly about the seriousness of an issue. Keep in mind that two factors affect seriousness: what's the impact when it happens (here it is "quite bad"), and what's the chance that it happens (it's "quite low", since it requires you to pickle very long string objects, which only few people ever attempt). So these two cancel each other out, in some form.

That said, I certainly agree that it needs to be fixed. AFAICT, the issue is that save_string uses "int" for size and len, when it should use Py_ssize_t. In addition, it shouldn't check for INT_MAX, but 0x7fffffff, since INT_MAX might be 2**63-1 on systems where int is a 64-bit type - but that should not be a problem on your system. I believe the bug exists in more cases; e.g. saving BINUNICODE.
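As a rough illustration of what storing a large size in a 32-bit "int" does, the sketch below narrows the length from the original report with ctypes.c_int32, which performs no overflow checking. It only demonstrates the arithmetic of the narrowing; whether cPickle wrote exactly this length to the file is not confirmed anywhere in this thread.

```python
import ctypes

# Length of the string in the original report.
requested = int(1.1 * 2**33)

# ctypes integer types do no overflow checking, so this keeps only the
# low 32 bits, much like assigning a Py_ssize_t to a 32-bit C int.
narrowed = ctypes.c_int32(requested).value

# Number of bytes that the narrowed length no longer accounts for.
lost = requested - narrowed
```

Here requested is 9448928051 and narrowed is 858993459, so exactly 2**33 bytes drop out of the length without any error being raised.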

Also, AFAICT, this shouldn't be a problem for 3.x, which already checks for overflow.

Then, AFAICT, there is a glitch in the BINUNICODE handling of 3.x, which rejects strings longer than 0xffffffff, when the maximum supported length really is 0x7fffffff.

----------
nosy: +loewis



report at bugs

Jul 30, 2012, 11:27 AM

Post #3 of 4
[issue15504] pickle/cPickle saves invalid/incomplete data [In reply to]

Changes by Arfrever Frehtes Taifersar Arahesis <Arfrever.FTA [at] GMail>:


----------
nosy: +Arfrever



report at bugs

Aug 6, 2012, 1:43 AM

Post #4 of 4
[issue15504] pickle/cPickle saves invalid/incomplete data [In reply to]

Changes by Tshepang Lekhonkhobe <tshepang [at] gmail>:


----------
nosy: +tshepang
