Reputation: 3531
I am fairly new to Python and I have a question about performance when comparing strings. Both snippets below seem to achieve what I want, but is there any reason to use one of them instead of the other?
Option 1:
if first_column_header == 'author name' or first_column_header == 'author' or first_column_header == 'name':
Option 2:
if first_column_header in ['author name', 'author', 'name']:
Upvotes: 3
Views: 225
Reputation: 184211
If you have a lot of choices, say more than a dozen, and speed is really critical, then use a `set`. It's the fastest to check membership, although there's some overhead since the item being checked needs to be hashed. Define the set ahead of time so it's not redefined each time the `if` statement is executed.
first_column_names = {"author name", "author", "name"}
# In Python before 2.7, you must use `set()` instead:
first_column_names = set(("author name", "author", "name"))
if first_column_header in first_column_names:
But if speed is critical, what are you writing it in Python for to begin with? :-) Generally you'll want to go with what's more readable. In this situation, that'll be a list or a tuple. Defining a tuple literal is marginally faster so, with readability being equal, I'd go that way:
if first_column_header in ("author name", "author", "name"):
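If you'd rather measure than guess, here is a minimal sketch using the standard `timeit` module. The function names (`match_or_chain`, `match_tuple`, `match_set`, `time_all`) and the sample headers are made up for illustration and aren't from the question. The numbers will depend on your interpreter and machine, so treat them as a rough comparison only.

```python
import timeit

# Built once at import time, so it is not recreated on every check.
FIRST_COLUMN_NAMES = frozenset({"author name", "author", "name"})


def match_or_chain(header):
    # Option 1: chained equality comparisons.
    return header == "author name" or header == "author" or header == "name"


def match_tuple(header):
    # Option 2 with a tuple literal instead of a list.
    return header in ("author name", "author", "name")


def match_set(header):
    # Hash-based membership test against the predefined set.
    return header in FIRST_COLUMN_NAMES


def time_all(header, number=100_000):
    """Return {function name: seconds taken for `number` calls}."""
    return {
        func.__name__: timeit.timeit(lambda: func(header), number=number)
        for func in (match_or_chain, match_tuple, match_set)
    }


if __name__ == "__main__":
    # Try a first-position hit, a last-position hit, and a miss.
    for header in ("author name", "name", "title"):
        for name, seconds in time_all(header).items():
            print(f"{header!r:15} {name:15} {seconds:.4f}s")
```

With only three candidates I'd expect the differences to be tiny, which is why readability is the better tiebreaker here.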
Upvotes: 3
Reputation: 29794
Option 2 is definitely shorter and more Pythonic. It may also add a little overhead to your code, because it creates a list and then iterates through it.
That's a trade-off you accept for more readable programs but, IMHO, the overhead is too small to worry about, so I'd go with Option 2.
Hope this helps!
Upvotes: 8
Reputation: 11180
I have to disagree with Paulo on the fact that Option 1 is faster. Here is what dis shows for these 2 functions:
def t():
    if a == 'author name' or a == 'author' or a == 'name':
        return True
    return False

def t2():
    if a in ['author name', 'author', 'name']:
        return True
    return False
It seems that a is loaded up to three times in the first case, while in Option 2 the list literal has already been folded into a constant tuple at compile time, so no list is built when the function runs.
  3           0 LOAD_GLOBAL              0 (a)
              3 LOAD_CONST               1 ('author name')
              6 COMPARE_OP               2 (==)
              9 POP_JUMP_IF_TRUE        36
             12 LOAD_GLOBAL              0 (a)
             15 LOAD_CONST               2 ('author')
             18 COMPARE_OP               2 (==)
             21 POP_JUMP_IF_TRUE        36
             24 LOAD_GLOBAL              0 (a)
             27 LOAD_CONST               3 ('name')
             30 COMPARE_OP               2 (==)
             33 POP_JUMP_IF_FALSE       40
  4     >>   36 LOAD_CONST               4 (True)
             39 RETURN_VALUE
  5     >>   40 LOAD_CONST               5 (False)
             43 RETURN_VALUE

  8           0 LOAD_GLOBAL              0 (a)
              3 LOAD_CONST               4 (('author name', 'author', 'name'))
              6 COMPARE_OP               6 (in)
              9 POP_JUMP_IF_FALSE       16
  9          12 LOAD_CONST               5 (True)
             15 RETURN_VALUE
 10     >>   16 LOAD_CONST               6 (False)
             19 RETURN_VALUE
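If you want to reproduce this yourself, here is a small sketch. It assumes CPython, since the constant folding is an implementation detail and the exact opcodes and offsets will differ between versions. The helper `tuple_constants` and the module-level value of `a` are my own additions for illustration.

```python
import dis

# Hypothetical value so both functions can actually be called.
a = 'author'


def t():
    if a == 'author name' or a == 'author' or a == 'name':
        return True
    return False


def t2():
    if a in ['author name', 'author', 'name']:
        return True
    return False


def tuple_constants(func):
    """Return the tuple constants stored in the function's code object."""
    return [c for c in func.__code__.co_consts if isinstance(c, tuple)]


if __name__ == "__main__":
    dis.dis(t)   # one load of `a` per comparison
    dis.dis(t2)  # a single membership test against a constant
    # On CPython the list literal should show up here as a tuple.
    print(tuple_constants(t2))
```

If the list literal shows up among the constants as a tuple, the list was never built at run time, which is what the disassembly above suggests.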
Upvotes: 1