
Mailing List Archive: Python: Bugs

[issue10376] ZipFile unzip is unbuffered

 

 



report at bugs

Apr 29, 2012, 4:56 AM

Post #1 of 6
[issue10376] ZipFile unzip is unbuffered

Serhiy Storchaka <storchaka [at] gmail> added the comment:

Actually, reading from the zip file is buffered (at least 4 KiB of uncompressed data at a time).

Can you provide tests, scripts, and data that show the problem?

----------

_______________________________________
Python tracker <report [at] bugs>
<http://bugs.python.org/issue10376>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/list-python-bugs%40lists.gossamer-threads.com


report at bugs

May 1, 2012, 1:27 PM

Post #2 of 6
[issue10376] ZipFile unzip is unbuffered [In reply to]

James Hutchison <jamesghutchison [at] gmail> added the comment:

See the attached script, which opens a zip file that contains one file and reads it repeatedly using unbuffered and buffered idioms. This was tested on Windows using Python 3.2.

You're in charge of coming up with a file to test it on. Sorry.

Example output:

Enter filename: test.zip
Timing unbuffered read, 5 bytes at a time. 10 loops
took 6.671999931335449
Timing buffered read, 5 bytes at a time (4000 byte buffer). 10 loops
took 0.7350001335144043
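
The attached script is not reproduced in this archive. As a rough sketch of the two idioms the output describes (not the actual zipfiletest.py), the following builds a small in-memory archive and reads its single member both ways. The helper names, the member name `data.bin`, and the payload are all assumptions made for illustration.

```python
import io
import zipfile


def make_test_zip(payload):
    """Build an in-memory zip with one member (hypothetical name 'data.bin')."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("data.bin", payload)
    buf.seek(0)
    return buf


def read_direct(zf, name, chunk=5):
    """'Unbuffered' idiom: one read() call on the member per small chunk."""
    parts = []
    with zf.open(name) as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            parts.append(data)
    return b"".join(parts)


def read_sliced(zf, name, chunk=5, bufsize=4000):
    """'Buffered' idiom: read large blocks, then slice small chunks from each."""
    parts = []
    with zf.open(name) as f:
        while True:
            block = f.read(bufsize)
            if not block:
                break
            for i in range(0, len(block), chunk):
                parts.append(block[i:i + chunk])
    return b"".join(parts)


payload = bytes(range(256)) * 40
with zipfile.ZipFile(make_test_zip(payload)) as zf:
    direct = read_direct(zf, "data.bin")
    sliced = read_sliced(zf, "data.bin")
```

Both functions return the same bytes; they differ only in how many `read()` calls reach the zip member, which is what the timings above compare.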

----------
Added file: http://bugs.python.org/file25432/zipfiletest.py



report at bugs

May 10, 2012, 3:24 PM

Post #3 of 6
[issue10376] ZipFile unzip is unbuffered [In reply to]

Serhiy Storchaka <storchaka [at] gmail> added the comment:

This is not because the zipfile module is unbuffered. It is the difference between an expensive function call and cheap bytes slicing. Replace `zf.open(namelist[0])` with `io.BufferedReader(zf.open(namelist[0]))` to see the effect of good buffering. In 3.2, zipfile's read() is not implemented optimally, so it is about twice as slow, but in 3.3 it will be almost as fast as using io.BufferedReader. It is still several times slower than bytes slicing, but nothing can be done about that.

Here is a patch that speeds up reading from a zip file in small chunks by about 20%. Microbenchmark:
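
For illustration, here is a minimal sketch of the suggested `io.BufferedReader` wrapping, run against a small in-memory archive. The function name `count_small_reads`, the member name `data.bin`, and the payload are hypothetical and do not come from the attached script; the sketch only checks that both paths read the same data and does not measure speed.

```python
import io
import zipfile

# Build a small in-memory archive; the member name 'data.bin' is hypothetical.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("data.bin", b"0123456789" * 1000)


def count_small_reads(zf, name, chunk=5, wrap=False):
    """Read a member in small chunks, optionally through io.BufferedReader."""
    raw = zf.open(name)
    # Wrapping lets most small read() calls be served from the wrapper's
    # own buffer instead of calling into the zip member each time.
    f = io.BufferedReader(raw) if wrap else raw
    total = 0
    with f:  # closing the wrapper also closes the underlying member
        while True:
            data = f.read(chunk)
            if not data:
                break
            total += len(data)
    return total


with zipfile.ZipFile(archive) as zf:
    plain = count_small_reads(zf, "data.bin")
    wrapped = count_small_reads(zf, "data.bin", wrap=True)
```

To compare speed, each call could be timed with `timeit` on a larger member; the result will depend on the Python version, as the figures in this thread suggest.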

./python -m zipfile -c test.zip python
./python -m timeit -n 1 -s "import zipfile;zf=zipfile.ZipFile('test.zip')" "with zf.open('python') as f:" " while f.read(1):pass"

Python 3.3 (vanilla): 1 loops, best of 3: 36.4 sec per loop
Python 3.3 (patched): 1 loops, best of 3: 30.1 sec per loop
Python 3.3 (with io.BufferedReader): 1 loops, best of 3: 30.2 sec per loop
And, for comparison, Python 3.2: 1 loops, best of 3: 74.5 sec per loop

----------
components: -Documentation
keywords: +patch
versions: -Python 2.7, Python 3.2
Added file: http://bugs.python.org/file25530/zipfile_optimize_read.patch



report at bugs

May 10, 2012, 3:26 PM

Post #4 of 6
[issue10376] ZipFile unzip is unbuffered [In reply to]

Changes by STINNER Victor <victor.stinner [at] gmail>:


----------
nosy: +pitrou



report at bugs

May 13, 2012, 11:32 AM

Post #5 of 6
[issue10376] ZipFile unzip is unbuffered [In reply to]

Changes by Martin v. Löwis <martin [at] v>:


Added file: http://bugs.python.org/file25565/zipfile_optimize_read.patch



report at bugs

May 13, 2012, 11:36 AM

Post #6 of 6
[issue10376] ZipFile unzip is unbuffered [In reply to]

Serhiy Storchaka <storchaka [at] gmail> added the comment:

Thank you, Martin. Now I understand why the Rietveld review did not work.

----------

