Reputation: 788
Hi all: I am new to Stack Overflow and rather new to Python, but I have been writing code for years and would like to know which of the following performs better.
Assume I have imported environ from os, and the flag in the environment is guaranteed to be either "0" or "1".
if environ["Flag"] == "1":
    do_something()
or
if int(environ["Flag"]) == 1:
    do_something()
At first glance, it looks like converting to int and then comparing would be slower because of the conversion. However, I know string comparisons can be slow too.
Has anyone ever examined this?
Thanks, Mark.
Upvotes: 7
Views: 7694
Reputation: 24731
Below is a quick and dirty comparison.
import time
s1 = '10000000000000000000001'
s2 = '10000000000000000000002'
Approach 1: Compare the strings directly, without casting them to ints.
This only makes sense when the numeric strings are of equal length, because '1000' < '2' is true lexicographically (since '1' < '2').
Preferable when you receive the numbers at different times in the execution flow.
t1 = time.time()
for i in range(10000000):
    if s1 < s2:
        pass
print(time.time() - t1)
Example output:
0.5940780639648438
Approach 2: Cast the strings to ints every time they are compared.
Preferable when you are dealing with numeric inequalities involving numeric strings of different lengths (note that for equality comparisons of numeric strings of different lengths there is no need to cast) and you receive the numbers at different times, rather than all at once at the beginning, in which case Approach 3 is more suitable.
t1 = time.time()
for i in range(10000000):
    if int(s1) < int(s2):
        pass
print(time.time() - t1)
Example output:
4.108525276184082
Approach 3: Cast the strings to ints once, up front, and compare the resulting ints afterwards.
n1 = int(s1)
n2 = int(s2)
t1 = time.time()
for i in range(10000000):
    if n1 < n2:
        pass
print(time.time() - t1)
Example output:
0.5334858894348145
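The caveat in Approach 1 about strings of different lengths is easy to check directly. This is only a small sketch; the sample pairs are made up for illustration and are not from the timings above.

```python
# Numeric strings are compared character by character, not by numeric value.
pairs = [("1000", "2"), ("10", "9"), ("123", "124")]  # illustrative values

results = []
for a, b in pairs:
    as_strings = a < b          # lexicographic comparison
    as_ints = int(a) < int(b)   # numeric comparison
    results.append((as_strings, as_ints))
    print(f"{a!r} < {b!r}: str={as_strings}, int={as_ints}")
```

Only the equal-length pair gives the same answer both ways, which is why the string comparison is safe only when the lengths match.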
Upvotes: 0
Reputation:
The others are right in that when in doubt, time it.
But here's a bit of explanation:
When you compare two strings, the algorithm looks something like this:
from 0 to the length of the shortest string
    if characters at this position are different
        return false
return true
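For anyone who prefers real code, the pseudocode above could be written roughly like this in Python. It is a simplified model, not what the interpreter actually runs (that is implemented in C), and the function name is made up. I have also added a length check up front as my own assumption, since otherwise "1" and "10" would count as equal.

```python
def strings_equal(a: str, b: str) -> bool:
    """Simplified model of string equality; hypothetical helper, not CPython's code."""
    # Assumption added to the pseudocode: different lengths can never be equal.
    if len(a) != len(b):
        return False
    # Walk both strings in step and stop at the first mismatch.
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            return False
    return True


print(strings_equal("1", "1"))
print(strings_equal("1", "0"))
```

The loop exits on the first differing character, so the cost grows with the length of the common prefix.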
So the speed of a string comparison depends on how much of the two strings is equal. In your example, you are comparing to "1", a one-character string. So in your case it boils down to:
if environ["Flag"][0] == "1"[0]
In other words, it is comparing a single byte to another single byte. Obviously a single comparison is going to be fast.
In your second case, you convert the string to an int. This takes a bit of time. But if we assume the best case, and that the flag is always "0" or "1", it's probably something like:
i = ord(s[0]) - ord("0")
Then you compare two integers. Integers are at least four bytes, not one, but that probably doesn't matter on modern chips.
But in any case, this means that when you compare two strings, you are doing a single comparison. When you convert to int, you are doing the work of the conversion, then doing a single comparison. Hence, the string comparison is faster.
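To make the "work of the conversion" concrete, here is a rough sketch of a digit-by-digit conversion. The helper name is hypothetical and this is not how int() is actually implemented; the real int() also handles signs, whitespace, and other bases, all of which this ignores.

```python
def parse_digits(s: str) -> int:
    """Hypothetical helper: convert a string of ASCII digits to an int."""
    if not s:
        raise ValueError("empty string")
    value = 0
    for ch in s:
        # Same idea as the single-character case: offset from the code of "0".
        digit = ord(ch) - ord("0")
        if not 0 <= digit <= 9:
            raise ValueError(f"not a decimal digit: {ch!r}")
        # Shift the digits seen so far one decimal place and add the new one.
        value = value * 10 + digit
    return value


print(parse_digits("1"))
```

Even in the one-digit case there is a subtraction, a range check, and a multiply-add before the comparison can happen, whereas the string version goes straight to the comparison.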
But again, this is situational. It is faster because you are comparing two strings of length 1. Comparing two ints takes constant time, but comparing two strings takes time proportional to the length of the shorter string.
Finally, taking a flag out of an environment variable is something you do only once per run. We're talking about a couple hundred nanoseconds in something you do once. Differences of that scale are only worth worrying about in loops that run many, many times. In this case, don't bother with performance and worry about what reads better. (Which is probably still the string comparison version.)
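If the flag does end up being checked inside a hot loop, one option is to read it once at startup and keep the result as a bool. This is a minimal sketch: the function name is made up, and treating a missing variable as "off" is my assumption (the question says the flag is always set).

```python
import os


def flag_enabled(env=os.environ, name="Flag"):
    """Return True only when the variable is exactly "1"."""
    # Assumption: a missing variable is treated the same as "0".
    return env.get(name, "0") == "1"


# Read the environment once; later checks test a plain bool.
FLAG_ENABLED = flag_enabled()

if FLAG_ENABLED:
    print("flag is on")
```

After that, neither the string comparison nor the int conversion is repeated, so the question of which one is faster stops mattering.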
Upvotes: 2
Reputation: 180441
In [44]: timeit int("1") == 1
1000000 loops, best of 3: 380 ns per loop
In [44]: timeit "1" == "1"
10000000 loops, best of 3: 36.5 ns per loop
Casting to int will always be slower, which makes perfect sense: you start out with a string and then convert it to an int, instead of just creating a string.
Converting is the most costly part:
In [45]: timeit 1
100000000 loops, best of 3: 11.9 ns per loop
In [46]: timeit "1"
100000000 loops, best of 3: 11 ns per loop
In [47]: timeit int("1")
1000000 loops, best of 3: 366 ns per loop
There is a difference between creating a string using a = "1" and doing a = 1; b = str(a), which is where you may have gotten confused.
In [3]: a = 1
In [4]: timeit str(a)
10000000 loops, best of 3: 135 ns per loop
Timed using Python 2.7; the difference using Python 3 is pretty much the same.
The output is from my IPython terminal, using the IPython magic timeit function.
Upvotes: 6
Reputation: 238279
Why not check it yourself?
import timeit
print(timeit.timeit('a="1"; a == "1"', number=10000))
print(timeit.timeit('a="1"; int(a) == 1', number=10000))
The result for me is:
0.0003461789892753586
0.0019836849969578907
Which would indicate that the string comparison is much faster.
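A slightly more careful variant, as a sketch: move the assignment into setup so it is not part of the timed statement, and take the minimum of several runs to cut down on noise. The loop count and repeat count here are arbitrary choices, and the numbers will differ from machine to machine.

```python
import timeit

SETUP = 'a = "1"'  # runs once per timing run, outside the measured statement
N = 100_000        # loop count; an arbitrary choice for this sketch

# repeat() returns one total per run; the minimum is the least noisy figure.
str_time = min(timeit.repeat('a == "1"', setup=SETUP, number=N, repeat=5))
int_time = min(timeit.repeat('int(a) == 1', setup=SETUP, number=N, repeat=5))

print(f"string compare:        {str_time / N * 1e9:.1f} ns per loop")
print(f"int convert + compare: {int_time / N * 1e9:.1f} ns per loop")
```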
Upvotes: 6