Reputation: 2729
Say I want to make a shallow copy of a list in Python. Which method is fastest?
I can think of:
copy.copy(l)
l[:]
[x for x in l]
list(l)
This question is only about the relative speed of making a shallow copy of a list in Python. I am mainly interested in CPython.
Upvotes: -1
Views: 386
Reputation: 13
Tested with Python 3.12 on Windows 10 using the test code below. Results are listed from fastest to slowest:
RESULTS:
.copy() - 0.0014513516970910132
[:] - 0.001602155597647652
[*] - 0.0016144087981665508
list() - 0.0018496818009298296
copy.copy() - 0.026636865100823342
copy.deepcopy() - 0.06005727069824934
_pickle - 0.12308293380064424
pickle - 0.12323138289933558
dill - 4.047985399956815
TEST CODE:
import random
import statistics
import copy
import timeit
import _pickle
import pickle
import dill
# -------------------------------------
rank_list = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K', 'Jo']
class CustomList(list):
    def __init__(self, name):
        self.name = name
        super().__init__()

class Card:
    def __init__(self):
        self.x = 1
        self.y = 2
        self.id = random.randint(1, 100)
        self.rank = rank_list[random.randint(0, len(rank_list) - 1)]

    def __repr__(self):
        return repr(self.id)
# -------------------------------------
# Below Line - The outer list.
nested_list = CustomList('nested_list')
# Below Lines - The 3 inner lists.
inner_list_1 = CustomList('inner_list_1')
inner_list_2 = CustomList('inner_list_2')
inner_list_3 = CustomList('inner_list_3')
# -------------------------------------
# Below Sections - Appending a different number of cards (5, 6 and 8) to each inner list.
for x in range(5):
    card = Card()
    inner_list_1.append(card)
for x in range(6):
    card = Card()
    inner_list_2.append(card)
for x in range(8):
    card = Card()
    inner_list_3.append(card)
# -------------------------------------
# Below Section - Appending each inner list to the outer list.
nested_list.append(inner_list_1)
nested_list.append(inner_list_2)
nested_list.append(inner_list_3)
# -------------------------------------
# Below Function - custom .copy() two-level copy: copies the outer list and each inner list
# (the Card objects themselves are shared). Params - nested_list = outer list.
def custom_copy(nested_list):
    outer_copy = []
    for inner_list in nested_list:
        inner_list_copy = inner_list.copy()
        outer_copy.append(inner_list_copy)
    return outer_copy
# -------------------------------------
# Below Function - custom [:] two-level copy. Params - nested_list = outer list.
def custom_slice(nested_list):
    outer_copy = []
    for inner_list in nested_list:
        inner_list_copy = inner_list[:]
        outer_copy.append(inner_list_copy)
    return outer_copy
# -------------------------------------
# Below Function - custom [*] two-level copy. Params - nested_list = outer list.
def custom_asterisk(nested_list):
    outer_copy = []
    for inner_list in nested_list:
        inner_list_copy = [*inner_list]
        outer_copy.append(inner_list_copy)
    return outer_copy
# -------------------------------------
# Below Function - custom list() two-level copy. Params - nested_list = outer list.
def custom_list_func(nested_list):
    outer_copy = []
    for inner_list in nested_list:
        inner_list_copy = list(inner_list)
        outer_copy.append(inner_list_copy)
    return outer_copy
# -------------------------------------
# Below Function - custom copy.copy() two-level copy. Params - nested_list = outer list.
# (The parameter was originally named l, so the body silently read the global nested_list.)
def custom_copy_dot_copy(nested_list):
    outer_copy = []
    for inner_list in nested_list:
        inner_list_copy = copy.copy(inner_list)
        outer_copy.append(inner_list_copy)
    return outer_copy
# -------------------------------------
# Below Line - Iteration num for timeit's number param. Change this var to change all timeit() calls.
num = 1000
# -------------------------------------
# Below Section - 1x timeit() call for each copy method.
print('.copy - ' + str(timeit.timeit('custom_copy(nested_list)', setup = 'from __main__ import custom_copy, nested_list', number = num)))
print('[:] - ' + str(timeit.timeit('custom_slice(nested_list)', setup = 'from __main__ import custom_slice, nested_list', number = num)))
print('[*] - ' + str(timeit.timeit('custom_asterisk(nested_list)', setup = 'from __main__ import custom_asterisk, nested_list', number = num)))
print('list() - ' + str(timeit.timeit('custom_list_func(nested_list)', setup = 'from __main__ import custom_list_func, nested_list', number = num)))
print('copy.copy() - ' + str(timeit.timeit('custom_copy_dot_copy(nested_list)', setup = 'from __main__ import custom_copy_dot_copy, nested_list', number = num)))
print('copy.deepcopy() - ' + str(timeit.timeit('copy.deepcopy(nested_list)', setup = 'from __main__ import copy, nested_list', number = num)))
print('_pickle - ' + str(timeit.timeit('_pickle.loads(_pickle.dumps(nested_list))', setup = 'from __main__ import _pickle, nested_list', number = num)))
print('pickle - ' + str(timeit.timeit('pickle.loads(pickle.dumps(nested_list))', setup = 'from __main__ import pickle, nested_list', number = num)))
print('dill - ' + str(timeit.timeit('dill.loads(dill.dumps(nested_list))', setup = 'from __main__ import dill, nested_list', number = num)))
# -------------------------------------
# Below Section - The lists which will contain all of the timeit() results for each given copy method.
dot_copy_timeit_totals_list = []
custom_slice_timeit_totals_list = []
custom_asterisk_timeit_totals_list = []
custom_list_func_timeit_totals_list = []
custom_copy_dot_copy_timeit_totals_list = []
copy_dot_deepcopy_timeit_totals_list = []
_pickle_timeit_totals_list = []
pickle_timeit_totals_list = []
# -------------------------------------
# Below Section - Iterates 1000x timeit() calls (w/ number arg == 1000) for each copy method, appending the returned value (the time of execution for each timeit() call) to each timeit_totals_list.
for x in range(1000):
    dot_copy_timeit_totals_list.append(timeit.timeit('custom_copy(nested_list)', setup = 'from __main__ import custom_copy, nested_list', number = num))
    custom_slice_timeit_totals_list.append(timeit.timeit('custom_slice(nested_list)', setup = 'from __main__ import custom_slice, nested_list', number = num))
    custom_asterisk_timeit_totals_list.append(timeit.timeit('custom_asterisk(nested_list)', setup = 'from __main__ import custom_asterisk, nested_list', number = num))
    custom_list_func_timeit_totals_list.append(timeit.timeit('custom_list_func(nested_list)', setup = 'from __main__ import custom_list_func, nested_list', number = num))
    custom_copy_dot_copy_timeit_totals_list.append(timeit.timeit('custom_copy_dot_copy(nested_list)', setup = 'from __main__ import custom_copy_dot_copy, nested_list', number = num))
    # NOTE: deepcopy is timed with number = 100 here, not num (1000), so its mean covers 10x fewer calls than the other rows.
    copy_dot_deepcopy_timeit_totals_list.append(timeit.timeit('copy.deepcopy(nested_list)', setup = 'from __main__ import copy, nested_list', number = 100))
    _pickle_timeit_totals_list.append(timeit.timeit('_pickle.loads(_pickle.dumps(nested_list))', setup = 'from __main__ import _pickle, nested_list', number = num))
    pickle_timeit_totals_list.append(timeit.timeit('pickle.loads(pickle.dumps(nested_list))', setup = 'from __main__ import pickle, nested_list', number = num))
# -------------------------------------
# Below Section - Prints out each of the copy method's avg. timeit() results in order from fastest to slowest.
print('.copy() - '+ str(statistics.mean(dot_copy_timeit_totals_list)))
print('[:] - '+ str(statistics.mean(custom_slice_timeit_totals_list)))
print('[*] - '+ str(statistics.mean(custom_asterisk_timeit_totals_list)))
print('list() - '+ str(statistics.mean(custom_list_func_timeit_totals_list)))
print('copy.copy() - '+ str(statistics.mean(custom_copy_dot_copy_timeit_totals_list)))
print('copy.deepcopy() - '+ str(statistics.mean(copy_dot_deepcopy_timeit_totals_list)))
print('_pickle - '+ str(statistics.mean(_pickle_timeit_totals_list)))
print('pickle - '+ str(statistics.mean(pickle_timeit_totals_list)))
Results - (Note: I ran about 20 of these tests using different timeit number args & for loop iteration amounts. The results remained mostly the same.)
.copy() was almost always the fastest.
[:] was almost always 2nd place.
[asterisk] was almost always 3rd place (sometimes it beat [:]).
list() was always 4th place.
copy.copy() was always 5th place.
copy.deepcopy() was always 6th place.
_pickle was always 7th place.
pickle was always 8th place.
dill was by FAR the slowest for some reason (excluded it from 1000x iter due to slow speed.)
.copy(), [:], [asterisk] and list() were by FAR the fastest.
copy.copy() was relatively fast compared to the slowest, but very slow compared to the fastest: 10x+ slower than the group above.
copy.deepcopy() was extremely slow: almost 3x slower than copy.copy() & ~35x slower than the fastest group
_pickle & pickle were nearly identical in speed: ~100x slower than the fastest
dill may have had some issues with my test b/c it was thousands of times slower than .copy() (roughly 2,800x in the run above)! I couldn't get it to be any faster by altering test params.
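One likely reason _pickle and pickle time almost identically: on CPython, the pickle module imports dumps and loads from the C accelerator module _pickle when it is available, so both names usually refer to the very same functions. A quick check, assuming CPython with the accelerator present (other implementations may behave differently):

```python
import pickle

try:
    import _pickle  # C accelerator; a CPython implementation detail
except ImportError:
    _pickle = None

# True when pickle has simply re-exported the C implementations.
same_dumps = _pickle is not None and pickle.dumps is _pickle.dumps
same_loads = _pickle is not None and pickle.loads is _pickle.loads

print("pickle.dumps is _pickle.dumps:", same_dumps)
print("pickle.loads is _pickle.loads:", same_loads)
```

If both lines print True, the two rows in the results measure the same code path and any difference between them is timing noise.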
Test Focus - My test focus was on finding an alternative to copy.deepcopy() for SMALL NESTED LISTS w/ CUSTOM OBJECTS, which I am using in my multithreaded online card game Python project. I found that copy.deepcopy() was causing severe lag in the game (which uses pygame), so I made a bunch of custom functions to mimic a (plain & simple) version of deepcopy.
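It may help to see exactly what such a "plain & simple" copy does and does not duplicate. The sketch below uses simplified stand-ins for the Card and CustomList classes from the test code (the Card constructor argument and the two_level_copy name are hypothetical, for illustration only). It suggests why copy.copy() lands in a slower tier here: the inner lists are list subclasses, so copy.copy() rebuilds a CustomList including its attributes, while .copy(), [:], [*] and list() return a plain list.

```python
import copy


class Card:
    def __init__(self, rank):
        self.rank = rank


class CustomList(list):
    def __init__(self, name):
        super().__init__()
        self.name = name


def two_level_copy(nested):
    """Copy the outer list and each inner list; the elements are shared."""
    return [inner.copy() for inner in nested]


hand = CustomList("hand")
hand.append(Card("A"))
table = CustomList("table")
table.append(hand)

fast = two_level_copy(table)
deep = copy.deepcopy(table)
shallow_inner = copy.copy(hand)

print(fast[0] is hand)                 # False: the inner list is a new object
print(fast[0][0] is hand[0])           # True: the Card is shared, not copied
print(type(fast[0]).__name__)          # list: the subclass and its name are dropped
print(type(shallow_inner).__name__)    # CustomList: copy.copy() keeps the subclass
print(shallow_inner.name)              # hand
print(deep[0][0] is hand[0])           # False: deepcopy also duplicates the Card
```

So the fast functions are only a safe replacement for copy.deepcopy() when the copies never mutate the shared Card objects and the CustomList attributes are not needed on the copy.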
Objects - I'm iterating through a list of 3 inner lists which contain some custom class objects (cards) which have only a few attributes. I'm not sure if things would perform differently if there were many more attributes or objects. I did alter things (slightly, not extremely) throughout testing concerning OBJECT amount, ATTRIBUTE amount and ATTRIBUTE types. The test results remained mostly the same.
Test Methodology -
Test Yourself: Use the code I posted and you can alter some of the objects or change the iteration amounts, etc to fit your needs.
Upvotes: 1
Reputation: 2729
Tested in Jupyter Notebook, Python 3.8
import copy
l = list(range(10000))
%%timeit
[x for x in l]
# 175 µs ± 5.23 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%%timeit
copy.copy(l)
# 22.6 µs ± 365 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%%timeit
l[:]
# 22 µs ± 1.28 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%%timeit
list(l)
# 21.6 µs ± 558 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
So copy.copy(l), l[:] and list(l) are all within noise of each other (about 22 µs), while the list comprehension is roughly 8x slower.
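For anyone without a notebook, the same comparison can be sketched as a plain script with timeit.repeat. The repeat and number values below are arbitrary choices, not the ones %%timeit picked, and l.copy() is added to the list because it appears in the other answer. Taking the minimum of the repeats is a common way to reduce noise from other processes.

```python
import copy
import timeit

l = list(range(10000))

statements = ["[x for x in l]", "copy.copy(l)", "l[:]", "list(l)", "l.copy()"]
namespace = {"l": l, "copy": copy}  # names visible to the timed statements
number = 200                        # calls per timing run (arbitrary)

results = {}
for stmt in statements:
    runs = timeit.repeat(stmt, globals=namespace, repeat=3, number=number)
    # Best run, converted to microseconds per single call.
    results[stmt] = min(runs) / number * 1e6

for stmt, microseconds in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{stmt:<16} {microseconds:8.2f} µs per call")
```

Absolute numbers will differ by machine and Python version, so only the relative ordering is worth comparing.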
Upvotes: 1