VoidTwo

Reputation: 637

Most Efficient Method to Concatenate Strings in Python

At the time of asking this question, I'm using Python 3.8

When I say efficient, I'm only referring to the speed at which the strings are concatenated. In more technical terms, I'm asking about time complexity, not space complexity.

The only methods I can think of at the moment are the following three, given:

a = 'start'
b = ' end'

Method 1

result = a + b

Method 2

result = ''.join((a, b))

Method 3

result = '{0}{1}'.format(a, b)

I want to know which of these methods is fastest, or if there are other methods that are more efficient. Also, if you know whether any of these methods performs differently with more strings or longer strings, please include that in your answer.

Edit

After seeing all the comments and answers, I have learned a couple of new ways to concatenate strings, and I have also learned about the timeit library. I will report my personal findings below:

>>> import timeit

>>> print(timeit.Timer('result = a + b', setup='a = "start"; b = " end"').timeit(number=10000))
0.0005306000000473432

>>> print(timeit.Timer('result = "".join((a, b))', setup='a = "start"; b = " end"').timeit(number=10000))
0.0011297000000354274

>>> print(timeit.Timer('result = "{0}{1}".format(a, b)', setup='a = "start"; b = " end"').timeit(number=10000))
0.002327799999989111

>>> print(timeit.Timer('result = f"{a}{b}"', setup='a = "start"; b = " end"').timeit(number=10000))
0.0005772000000092703

>>> print(timeit.Timer('result = "%s%s" % (a, b)', setup='a = "start"; b = " end"').timeit(number=10000))
0.0017815999999584164

It seems that for these small strings, the traditional a + b method is the fastest for string concatenation. Thanks for all of the answers!
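A single timeit run can be noisy, so the ranking between close contenders may shift from run to run. As a rough sketch (the `CANDIDATES` dict and the `best_time` and `run_all` helpers are illustrative names, not something from the answers), the same five statements can be timed with timeit.repeat, keeping the minimum of several runs:

```python
import timeit

SETUP = 'a = "start"; b = " end"'

# Illustrative labels mapped to the five statements timed above.
CANDIDATES = {
    "plus": "result = a + b",
    "join": 'result = "".join((a, b))',
    "format": 'result = "{0}{1}".format(a, b)',
    "f-string": 'result = f"{a}{b}"',
    "percent": 'result = "%s%s" % (a, b)',
}


def best_time(stmt, number=10000, repeat=5):
    """Return the fastest of `repeat` runs, each executing `stmt` `number` times."""
    return min(timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat))


def run_all(number=10000, repeat=5):
    """Time every candidate and return a {label: seconds} dict."""
    return {
        label: best_time(stmt, number, repeat)
        for label, stmt in CANDIDATES.items()
    }


if __name__ == "__main__":
    # Print the candidates from fastest to slowest on this machine.
    for label, seconds in sorted(run_all().items(), key=lambda kv: kv[1]):
        print(f"{label:>8}: {seconds:.6f} s")
```

Taking the minimum rather than the mean is a common choice here, since slower runs usually reflect interference from other processes rather than the statement itself.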

Upvotes: 11

Views: 19482

Answers (3)

Attie

Reputation: 6969

Let's try it out! We can use timeit.timeit() to run a statement many times and return the overall duration.

Here, we use s to set up the variables a and b (not included in the overall time), and then run the various options 10 million times.

>>> from timeit import timeit
>>>
>>> n = 10 * 1000 * 1000
>>> s = "a = 'start'; b = ' end'"
>>>
>>> timeit("c = a + b",                 setup=s, number=n)
0.4452877212315798
>>>
>>> timeit("c = f'{a}{b}'",             setup=s, number=n)
0.5252049304544926
>>>
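>>> # Note: '%s%s' has no {} fields, so .format() returns it unchanged and nothing is concatenated here.
>>> timeit("c = '%s%s'.format(a, b)",   setup=s, number=n)
0.6849184390157461
>>>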
>>> timeit("c = ''.join((a, b))",       setup=s, number=n)
0.8546998891979456
>>>
>>> timeit("c = '%s%s' % (a, b)",       setup=s, number=n)
1.1699129864573479
>>>
>>> timeit("c = '{0}{1}'.format(a, b)", setup=s, number=n)
1.5954962372779846

This shows that unless your application's bottleneck is string concatenation, it's probably not worth being too concerned about...

  • The best case is ~0.45 seconds for 10 million iterations, or about 45ns per operation.
  • The worst case is ~1.60 seconds for 10 million iterations, or about 160ns per operation.

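In these measurements the gap between the fastest and slowest method is roughly 1.15 seconds per 10 million operations, so, depending on the performance of your system, a speed improvement only becomes noticeable if you're performing literally millions of operations.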

Note that your results may vary quite drastically depending on the lengths (and number) of the strings you're concatenating, and the hardware you're running on.
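To see how string length affects the picture, one option is to rerun the comparison for a few sizes. This is only a sketch: the `STATEMENTS` dict, the `time_for_length` helper, and the chosen lengths are assumptions made for illustration, and the numbers it prints will differ between machines.

```python
from timeit import timeit

# Four of the statements from the session above, keyed by an illustrative label.
STATEMENTS = {
    "a + b": "c = a + b",
    "f-string": "c = f'{a}{b}'",
    "join": "c = ''.join((a, b))",
    "format": "c = '{0}{1}'.format(a, b)",
}


def time_for_length(length, number=20_000):
    """Time each statement for two strings of `length` characters each."""
    env = {"a": "x" * length, "b": "y" * length}
    return {
        label: timeit(stmt, globals=env, number=number)
        for label, stmt in STATEMENTS.items()
    }


if __name__ == "__main__":
    for length in (5, 500, 50_000):
        timings = time_for_length(length)
        row = ", ".join(f"{label}: {t:.4f}s" for label, t in timings.items())
        print(f"length {length:>6}: {row}")
```

For long inputs, copying the characters tends to dominate the fixed per-call overhead, so the relative differences between the methods would be expected to shrink.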

Upvotes: 11

chepner

Reputation: 530970

For exactly two strings a and b, just use a + b. The alternatives are for joining more than 2 strings, avoiding the temporary str object created by each use of +, as well as the quadratic behavior due to repeatedly copying the contents of earlier operations in the next result.

(There's also f'{a}{b}', but it's syntactically heavier and no faster than a + b.)
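The difference for many strings can be sketched as follows. The helper names `concat_plus` and `concat_join` and the input size are made up for illustration. Note that CPython can sometimes extend the left-hand string in place during repeated +, so the measured gap may be smaller than the quadratic worst case suggests.

```python
from timeit import timeit


def concat_plus(parts):
    """Build the result with repeated +, producing an intermediate str per step."""
    result = ""
    for part in parts:
        result = result + part
    return result


def concat_join(parts):
    """Build the result in a single pass with str.join."""
    return "".join(parts)


if __name__ == "__main__":
    parts = ["chunk"] * 10_000
    # Both approaches must produce the same string.
    assert concat_plus(parts) == concat_join(parts)
    for func in (concat_plus, concat_join):
        seconds = timeit(lambda: func(parts), number=100)
        print(f"{func.__name__}: {seconds:.4f} s")
```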

Upvotes: 7

whege

Reputation: 1441

from datetime import datetime
a = "start"
b = " end"

start = datetime.now()
print(a+b)
print(datetime.now() - start)

start = datetime.now()
print("".join((a, b)))
print(datetime.now() - start)

start = datetime.now()
print('{0}{1}'.format(a, b))
print(datetime.now() - start)

# Output
# start end
# 0:00:00.000056
# start end
# 0:00:00.000014
# start end
# 0:00:00.000014

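In this single run, .join() and .format() look basically the same and about 4x faster than a + b, though a one-off timing that includes print() is noisy and the first measurement can absorb start-up overhead. An f-string, e.g.: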

print(f'{a}{b}')

is also a very quick and clean method, especially when working with more complex formats.
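As a small illustration of the "more complex formats" case, here is a sketch that mixes plain concatenation with format specifiers. The `count` and `ratio` variables are invented for the example.

```python
a = "start"
b = " end"
count = 3       # hypothetical extra values, only here to show format specs
ratio = 0.4567

# Plain concatenation, equivalent to a + b.
plain = f"{a}{b}"

# Mixing strings with a number and a percentage rounded to one decimal place.
labeled = f"{a}{b} x{count} ({ratio:.1%})"

# Right-align a in 8 columns, left-align the stripped b in 6 columns.
padded = f"[{a:>8}|{b.strip():<6}]"

print(plain)    # start end
print(labeled)  # start end x3 (45.7%)
print(padded)   # [   start|end   ]
```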

Upvotes: 0
