Reputation: 1903
I have a program that runs in both Python 2 and Python 3, but there is a drastic difference in speed. I understand a number of internal changes were made in the switch, but the difference in io.BufferedReader performance is really large. In both versions, I use io.BufferedReader because the main program loop only needs data one byte at a time. Here is an excerpt from the cProfile output for the script (see cumtime, not tottime):
Python 2:
ncalls tottime percall cumtime percall filename:lineno(function)
36984 0.188 0.000 0.545 0.000 io.py:929(read)
Python 3:
36996 0.063 0.000 0.063 0.000 {method 'read' of '_io.BufferedReader' objects}
When I print the object, both return something like io.BufferedReader, so I am certain they are both using BufferedReader.
Here is the code in question. See line 28. The caller is responsible for setting up bufstream. I used bufstream = io.open('testfile', 'rb')
Why is there such a drastic difference in speed of BufferedReader for reading single bytes in the files, and how can I "fix" the issue for Python 2.x? I am running Python 2.6 and Python 3.1.
Upvotes: 2
Reputation: 82942
To give you a fuller answer, one would need to see your code (or, better, an executable precis of your code).
However, a partial answer can be gleaned from your profile output: io.py suggests that "Python 2" (for avoidance of doubt, give the actual version numbers) is implementing BufferedReader in Python, whereas _io.BufferedReader suggests that "Python 3" is implementing it in C.
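One way to check this directly, rather than inferring it from the profile, is to ask the reader object which module defines its class. The following is a minimal sketch; the helper name reader_implementation and the scratch file are made up for illustration. A module name of io points at the pure-Python implementation, while _io points at the C one.

```python
import io
import os
import tempfile


def reader_implementation(path):
    """Return (class name, defining module) of the object io.open returns for 'rb'."""
    f = io.open(path, 'rb')
    try:
        return type(f).__name__, type(f).__module__
    finally:
        f.close()


# Scratch file used only so that there is something to open.
fd, scratch_path = tempfile.mkstemp()
os.close(fd)
try:
    name, module = reader_implementation(scratch_path)
    print('%s is defined in module %s' % (name, module))
finally:
    os.remove(scratch_path)
```

Running the same script under each interpreter shows which implementation that interpreter is actually using.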
Late-breaking news: Python 2.6's io.py is over 64 KB and includes the following comment near the top:
# This is a prototype; hopefully eventually some of this will be
# reimplemented in C.
Python 2.7's io.py is about 4 KB and appears to be a thin wrapper around an _io module.
If you want real assistance with a workaround for 2.6, show your code.
Probable workaround for Python 2.6
Instead of:
test = io.open('test.bmp', 'rb')
do this:
test = open('test.bmp', 'rb')
Some rough timing figures, including the missing link (Python 2.7):
Windows 7 Pro, 32-bit, approx. 5 MB file; the guts of the code are:
    while 1:
        c = f.read(1)
        if not c: break
2.6: io.open 20.4s, open 5.1s
2.7: io.open 3.3s, open 4.8s # io.open is better
3.1: io.open 3.6s, open 3.6s # effectively same code is used
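For anyone who wants to repeat the comparison on their own machine, here is a rough sketch of a harness built around the same read loop. It is not the script that produced the figures above; the function name time_single_byte_reads, the 64 KB scratch file, and the use of timeit.default_timer are assumptions made for illustration.

```python
import io
import os
import tempfile
from timeit import default_timer


def time_single_byte_reads(opener, path):
    """Read the file at path one byte at a time; return (seconds, bytes read)."""
    start = default_timer()
    count = 0
    f = opener(path, 'rb')
    try:
        while True:
            c = f.read(1)
            if not c:
                break
            count += 1
    finally:
        f.close()
    return default_timer() - start, count


# Small scratch file so the demo finishes quickly; use a larger file for real timings.
fd, scratch_path = tempfile.mkstemp()
os.write(fd, b'\0' * (64 * 1024))
os.close(fd)
try:
    for label, opener in (('io.open', io.open), ('open', open)):
        seconds, count = time_single_byte_reads(opener, scratch_path)
        print('%-8s %d bytes in %.3fs' % (label, count, seconds))
finally:
    os.remove(scratch_path)
```

Absolute numbers will vary with hardware and file size; the ratio between the two openers under a given interpreter is the interesting part.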
So a better story seems to be this: in general, don't faff about with io.open unless you have a good reason to, e.g. you want 2.7 to go faster.
Upvotes: 6
Reputation: 10663
Using 2.7 should solve this. See PEP 3116 and the Python 2.7 docs.
Part of the io module is written in Python in 2.6, while in 2.7+ the whole module is written in C.
Upvotes: 4