Reputation: 2485
I'm trying to optimize my Python 2.7.x code. I'm going to perform one operation inside a for loop, possibly millions of times, so I want it to be as quick as possible.
My operation is taking a list of 10 strings and converting them to 2 integers followed by 8 floats.
Here is a MWE of my attempts:
import timeit

words = ["1"] * 10

start_time = timeit.default_timer()
for ii in range(1000000):
    values = map(float, words)
    values[0] = int(values[0])
    values[1] = int(values[1])
print "1", timeit.default_timer() - start_time

start_time = timeit.default_timer()
for ii in range(1000000):
    values = map(int, words[:2]) + map(float, words[2:])
print "2", timeit.default_timer() - start_time

start_time = timeit.default_timer()
local_map = map
for ii in range(1000000):
    values = local_map(float, words)
    values[0] = int(values[0])
    values[1] = int(values[1])
print "3", timeit.default_timer() - start_time
1 2.86574220657
2 3.83825802803
3 2.86320781708
The first block of code is the fastest I've managed. The map function seems much quicker than using a list comprehension. But there's still some redundancy, because I map everything to float and then change the first two items to integers.
Is there anything quicker than my code?
Why doesn't making the map function local (local_map = map) improve the speed in the third block of code?
Upvotes: 1
Views: 221
Reputation: 155428
I haven't found anything faster, but your fastest code is actually going to be wrong in some cases. The problem is that a Python float (which is a C double) has limited precision: beyond 2 ** 53 (IIRC; might be off by one on the bit count), it can't represent all integer values. By contrast, Python's int is arbitrary precision; given enough memory, it can represent effectively any integer value.
You'd want to change:
values[0] = int(values[0])
values[1] = int(values[1])
to:
values[0] = int(words[0])
values[1] = int(words[1])
to avoid that. The reparsing makes the cost more dependent on the length of the strings being parsed, because converting the same string twice costs more for longer inputs.
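To see the precision loss concretely, here is a small sketch (the value 2 ** 53 + 1 is just the smallest integer a double cannot represent exactly):

```python
# Routing int parsing through float is lossy: a C double has a
# 53-bit mantissa, so not every integer above 2**53 is representable.
s = "9007199254740993"  # 2**53 + 1

exact = int(s)             # arbitrary-precision parse, always correct
via_float = int(float(s))  # round-trips through a double

print(exact)      # 9007199254740993
print(via_float)  # 9007199254740992 -- silently off by one
```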
An alternative that at least on my Python (3.5) works fairly fast is to preconstruct the set of converters so you can call the correct function directly. For example:
words = ["1"] * 10
converters = (int,) * 2 + (float,) * 8
values = [f(v) for f, v in zip(converters, words)]
You'll want to test with both versions of zip to see whether the list-generating version or the generator-based itertools.izip is faster (for short inputs like these, I really can't say). In Python 3.5 (where zip is always a generator, like Py2's itertools.izip), this took about 10% longer than your fastest solution for the same inputs (I used min() of a timeit.repeat run rather than the hand-rolled timing you used); it might do better if the inputs are larger (and therefore parsing twice would be more expensive).
Upvotes: 1