Reputation: 333
from time import time

l = [1,0,0,1,1]

count = 0
start = time()
for _ in range(100000):
    for x in range(len(l)-1):
        for y in range(x+1, len(l)):
            if l[x] + l[y] == 1:
                count += 1
end = time()

count2 = 0
start2 = time()
for _ in range(100000):
    for x in range(len(l)-1):
        for y in range(x+1, len(l)):
            if l[x]^l[y]:
                count2 += 1
end2 = time()

print str(count) + ' : Add and compare took: ' + str((end - start)/100000)
print str(count2) + ' : Bitwise took: ' + str((end2 - start2)/100000)
From what I understood of bitwise operations, they were supposed to be faster than simple comparisons. Yet these two loops end up running at essentially the same speed.
What is the benefit of using the arguably more complicated bitwise operations over integer comparison when integer comparison appears to be just as fast?
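As a rough cross-check of the same question (a sketch using Python 3's `timeit` rather than the Python 2 timings above): both condition expressions spend most of their time on interpreter dispatch and list indexing, so the single arithmetic opcode is a small fraction of the total.

```python
import timeit

l = [1, 0, 0, 1, 1]

# Time just the two condition expressions from the loops above.
t_add = timeit.timeit('l[0] + l[1] == 1', globals={'l': l}, number=1000000)
t_xor = timeit.timeit('l[0] ^ l[1]', globals={'l': l}, number=1000000)

print('add+compare: %.3fs  xor: %.3fs' % (t_add, t_xor))
```

On a typical CPython build the two numbers come out within a few percent of each other, matching the observation above.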
Edit:
It appears that opcodes are not all created equal.
Austin's answer mentioned that the difference between the two operations was 3 opcodes vs 1; however, the following example has the same number of opcodes but significantly different performance:
from time import time
from dis import dis

i = j = 10

def test1():
    if i == j:
        print True

def test2():
    if not i-j:
        print True

print 'test 1'
start1 = time()
test1()
end1 = time()
dis(test1)

print 'test 2'
start2 = time()
test2()
end2 = time()
dis(test2)

print 'Test 1 took: ' + str(end1 - start1)
print 'Test 2 took: ' + str(end2 - start2)
This will output:
test 1
True
25 0 LOAD_GLOBAL 0 (i)
3 LOAD_GLOBAL 1 (j)
6 COMPARE_OP 2 (==)
9 POP_JUMP_IF_FALSE 20
26 12 LOAD_GLOBAL 2 (True)
15 PRINT_ITEM
16 PRINT_NEWLINE
17 JUMP_FORWARD 0 (to 20)
>> 20 LOAD_CONST 0 (None)
23 RETURN_VALUE
test 2
True
29 0 LOAD_GLOBAL 0 (i)
3 LOAD_GLOBAL 1 (j)
6 BINARY_SUBTRACT
7 POP_JUMP_IF_TRUE 18
30 10 LOAD_GLOBAL 2 (True)
13 PRINT_ITEM
14 PRINT_NEWLINE
15 JUMP_FORWARD 0 (to 18)
>> 18 LOAD_CONST 0 (None)
21 RETURN_VALUE
Test 1 took: 7.86781311035e-06
Test 2 took: 5.00679016113e-06
Is there a more accurate way of measuring efficiency?
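One conventional answer to the measurement question (a sketch using the stdlib `timeit` module rather than the `time()`-around-one-call pattern above): run the statement many times per trial, repeat the trial several times, and take the minimum, which filters out OS scheduling noise and one-off costs such as the `print` I/O.

```python
import timeit

setup = 'i = j = 10'

# min() over several repeats is the usual estimate: it is the trial with
# the least interference from the rest of the system.
t_eq = min(timeit.repeat('i == j', setup=setup, number=1000000, repeat=5))
t_sub = min(timeit.repeat('not i - j', setup=setup, number=1000000, repeat=5))

print('i == j  : %.3fs' % t_eq)
print('not i-j : %.3fs' % t_sub)
```

Timing a single call, as above, measures mostly call overhead and whatever the OS happened to be doing in that microsecond.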
Edit:
Opcodes are created almost equal.
Modifying the code to exclude the I/O shows how much the `print` calls were distorting the timings.
from time import time
from dis import dis

i = j = 10
bool1 = False
bool2 = False

def test1():
    if i == j:
        bool1 = True

def test2():
    if not i-j:
        bool2 = True

print 'test 1'
start1 = time()
for _ in range(1000000):
    test1()
end1 = time()
dis(test1)

print 'test 2'
start2 = time()
for _ in range(1000000):
    test2()
end2 = time()
dis(test2)

print str(bool1) + ' : Test 1 took: ' + str(end1 - start1)
print str(bool2) + ' : Test 2 took: ' + str(end2 - start2)
Will print:
test 1
27 0 LOAD_GLOBAL 0 (i)
3 LOAD_GLOBAL 1 (j)
6 COMPARE_OP 2 (==)
9 POP_JUMP_IF_FALSE 21
28 12 LOAD_GLOBAL 2 (True)
15 STORE_FAST 0 (bool1)
18 JUMP_FORWARD 0 (to 21)
>> 21 LOAD_CONST 0 (None)
24 RETURN_VALUE
test 2
31 0 LOAD_GLOBAL 0 (i)
3 LOAD_GLOBAL 1 (j)
6 BINARY_SUBTRACT
7 POP_JUMP_IF_TRUE 19
32 10 LOAD_GLOBAL 2 (True)
13 STORE_FAST 0 (bool2)
16 JUMP_FORWARD 0 (to 19)
>> 19 LOAD_CONST 0 (None)
22 RETURN_VALUE
False : Test 1 took: 0.156816959381
False : Test 2 took: 0.16281914711
So the gap is not as drastic, but the timings still differ slightly. This was run ~12 times, with Test 1 taking longer only once.
So there is still some mystery, just not as drastic a one.
Upvotes: 1
Views: 71
Reputation: 15310
All the code in your loops is basically the same, so I eliminated it. Instead, I reduced your code to two functions and asked my good friend dis.dis
to show me what they were doing:
l = [1,0,0,1,1]

def f1():
    x = y = 0
    if l[x] + l[y] == 1:
        count += 1

def f2():
    x = y = 0
    if l[x]^l[y]:
        count2 += 1

import dis
print "F1"
dis.dis(f1)
print "F2"
dis.dis(f2)
Here's the output:
$ python2.7 test.py
F1
4 0 LOAD_CONST 1 (0)
3 DUP_TOP
4 STORE_FAST 0 (x)
7 STORE_FAST 1 (y)
5 10 LOAD_GLOBAL 0 (l)
13 LOAD_FAST 0 (x)
16 BINARY_SUBSCR
17 LOAD_GLOBAL 0 (l)
20 LOAD_FAST 1 (y)
23 BINARY_SUBSCR
24 BINARY_ADD ## Here.
25 LOAD_CONST 2 (1) ## Here.
28 COMPARE_OP 2 (==) ## Here.
31 POP_JUMP_IF_FALSE 47
6 34 LOAD_FAST 2 (count)
37 LOAD_CONST 2 (1)
40 INPLACE_ADD
41 STORE_FAST 2 (count)
44 JUMP_FORWARD 0 (to 47)
>> 47 LOAD_CONST 0 (None)
50 RETURN_VALUE
F2
9 0 LOAD_CONST 1 (0)
3 DUP_TOP
4 STORE_FAST 0 (x)
7 STORE_FAST 1 (y)
10 10 LOAD_GLOBAL 0 (l)
13 LOAD_FAST 0 (x)
16 BINARY_SUBSCR
17 LOAD_GLOBAL 0 (l)
20 LOAD_FAST 1 (y)
23 BINARY_SUBSCR
24 BINARY_XOR ## Here.
25 POP_JUMP_IF_FALSE 41
11 28 LOAD_FAST 2 (count2)
31 LOAD_CONST 2 (1)
34 INPLACE_ADD
35 STORE_FAST 2 (count2)
38 JUMP_FORWARD 0 (to 41)
>> 41 LOAD_CONST 0 (None)
44 RETURN_VALUE
The difference is 3 opcodes versus 1, while the setup shared by both (the loads and subscripts) is 6 opcodes either way. The difference is lost in the noise.
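The same comparison can be reproduced on Python 3, where the opcode names differ from the 2.7 listing above (recent versions use `BINARY_OP` in place of `BINARY_ADD`/`BINARY_XOR`) but the shape is identical: the add-and-compare condition compiles to a couple more instructions than the xor, on top of an identical setup. A sketch:

```python
import dis

l = [1, 0, 0, 1, 1]

def f1(x=0, y=0):
    return l[x] + l[y] == 1   # subscripts, add, load const 1, compare

def f2(x=0, y=0):
    return l[x] ^ l[y]        # subscripts, xor

# Count the compiled instructions for each condition.
n1 = len(list(dis.get_instructions(f1)))
n2 = len(list(dis.get_instructions(f2)))
print('f1: %d instructions, f2: %d instructions' % (n1, n2))
```

The extra instructions (loading the constant `1` and the comparison) are cheap relative to the global lookups and subscript operations both versions share, which is why the measured difference vanishes into interpreter overhead.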
Upvotes: 3