Reputation: 3328
I'm looking at the Python library function namedtuple from collections, referring to https://github.com/python/cpython/blob/master/Lib/collections/init.py:
def namedtuple(typename, field_names, *, verbose=False, rename=False, module=None):
    # Validate the field names. At the user's option, either generate an error
    # message or automatically replace the field name with a valid name.
    if isinstance(field_names, str):
        field_names = field_names.replace(',', ' ').split()
The last line of the code above uses replace(',', ' ').split() rather than split(','). I'm wondering what the reason for that is.
Here is the test code to measure the time cost:
from random import randrange

def create_str(n):
    a = []
    for _i in range(n):
        a.append(str(randrange(101)))
    return ','.join(a)

s = create_str(1000)
# print(s)

def test_a():
    s.split(',')

def test_b():
    s.replace(',', ' ').split()

if __name__ == '__main__':
    import timeit
    print(['test_a: ', timeit.timeit("test_a()", setup="from __main__ import test_a")])
    print(['test_b: ', timeit.timeit("test_b()", setup="from __main__ import test_b")])
The output from the above:
['test_a: ', 59.938546671997756]
['test_b: ', 68.51630863297032]
With s = create_str(10), I got the following:
['test_a: ', 0.9246872899821028]
['test_b: ', 1.2178910280345008]
With s = create_str(100), I got the following:
['test_a: ', 6.570624853018671]
['test_b: ', 7.8685859580291435]
test_a, the plain split(','), is faster in every case.
https://docs.python.org/3/library/collections.html#collections.namedtuple says the following:
The field_names are a sequence of strings such as ['x', 'y']. Alternatively, field_names can be a single string with each fieldname separated by whitespace and/or commas, for example 'x y' or 'x, y'.
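To make the documented behaviour concrete, here is a small sketch that passes each of the accepted forms to namedtuple and prints the resulting fields. The type name Point and the list name specs are only illustrative choices, not anything taken from the library source.

```python
from collections import namedtuple

# Field specifications in the forms the documentation describes:
# a sequence of strings, a whitespace-separated string, and
# comma-separated strings with and without a space after the comma.
specs = [['x', 'y'], 'x y', 'x, y', 'x,y']

for spec in specs:
    Point = namedtuple('Point', spec)  # 'Point' is an illustrative name
    print(repr(spec), '->', Point._fields)
```

Each form should yield the same two fields, x and y.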
Upvotes: 0
Views: 55
Reputation: 2243
Execution time difference aside, these two do not do exactly the same thing.
Consider the string 'a, b, c'. Using replace + split results in ['a', 'b', 'c'], while splitting on ',' results in ['a', ' b', ' c'].
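The difference can be checked with a short sketch along these lines. The helper names split_on_comma and replace_then_split are made up for this illustration and do not appear in the collections source.

```python
def split_on_comma(field_names):
    # Splits only at commas; surrounding whitespace stays in the pieces.
    return field_names.split(',')

def replace_then_split(field_names):
    # Turns commas into spaces, then splits on any run of whitespace.
    return field_names.replace(',', ' ').split()

for text in ['a,b,c', 'a, b, c', 'a b c']:
    print(repr(text), split_on_comma(text), replace_then_split(text))
```

Only the second approach handles all three spellings the same way. The first one keeps the leading spaces in 'a, b, c' and does not split 'a b c' at all.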
Asking which option is faster is largely irrelevant, since these operations (I mean calls to namedtuple()) are generally done at import time.
So unless you are generating new namedtuple types at runtime in a tight loop, using dynamically generated string (not list) field names, the time difference is trivial.
Upvotes: 2