Kevin Wang

Reputation: 2729

Which method to copy a flat list is faster: comprehension, slice, or copy.copy?

Say I want to make a shallow copy of a list in Python. Which method is fastest?

I can think of

This question is specifically about the relative speed of the ways to make a shallow copy of a list in Python. I am interested in CPython in particular.

Upvotes: -1

Views: 386

Answers (2)

Joshua Merrifield

Reputation: 13

Tested with Python 3.12 on Windows 10 using the test code below. Results are listed from fastest to slowest:

RESULTS:

.copy() -         0.0014513516970910132
[:] -             0.001602155597647652
[*] -             0.0016144087981665508
list() -          0.0018496818009298296
copy.copy() -     0.026636865100823342
copy.deepcopy() - 0.06005727069824934
_pickle -         0.12308293380064424
pickle -          0.12323138289933558
dill -            4.047985399956815
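
To put these totals on a common scale, each one can be divided by the fastest. A small sketch using only the numbers listed above (the names `results` and `ratios` are just for illustration):

```python
# Timings copied from the RESULTS list above (seconds, as printed).
results = {
    ".copy()": 0.0014513516970910132,
    "[:]": 0.001602155597647652,
    "[*]": 0.0016144087981665508,
    "list()": 0.0018496818009298296,
    "copy.copy()": 0.026636865100823342,
    "copy.deepcopy()": 0.06005727069824934,
    "_pickle": 0.12308293380064424,
    "pickle": 0.12323138289933558,
    "dill": 4.047985399956815,
}

# Express every timing as a multiple of the fastest one.
fastest = min(results.values())
ratios = {name: seconds / fastest for name, seconds in results.items()}

for name, ratio in ratios.items():
    print(f"{name:<16} {ratio:8.1f}x")
```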

TEST CODE:

import random
import statistics
import copy
import timeit
import _pickle
import pickle
import dill
# -------------------------------------
rank_list = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K', 'Jo']

class CustomList(list):
    def __init__(self, name):
        self.name = name
        super().__init__()

class Card():
    def __init__(self):
        self.x = 1
        self.y = 2
        self.id = random.randint(1, 100)
        self.rank = rank_list[random.randint(0, len(rank_list) - 1)]

    def __repr__(self):
        return repr(self.id)
# -------------------------------------
# Below Line - The outer list.
nested_list = CustomList('nested_list')
# Below Lines - The 3 inner lists.
inner_list_1 = CustomList('inner_list_1')
inner_list_2 = CustomList('inner_list_2')
inner_list_3 = CustomList('inner_list_3')
# -------------------------------------
# Below Sections - Appending random amount of cards to each inner list.
for x in range(5):
    card = Card()
    inner_list_1.append(card)

for x in range(6):
    card = Card()
    inner_list_2.append(card)

for x in range(8):
    card = Card()
    inner_list_3.append(card)
# -------------------------------------
# Below Section - Appending each inner list to the outer list.
nested_list.append(inner_list_1)
nested_list.append(inner_list_2)
nested_list.append(inner_list_3)
# -------------------------------------
# Below Function - custom .copy() deep copy function. Params - nested_list = outer list.
def custom_copy(nested_list):
    outer_copy = []
    for inner_list in nested_list:
        inner_list_copy = inner_list.copy()
        outer_copy.append(inner_list_copy)
    return outer_copy
# -------------------------------------
# Below Function - custom [:] deep copy function. Params - nested_list = outer list.
def custom_slice(nested_list):
    outer_copy = []
    for inner_list in nested_list:
        inner_list_copy = inner_list[:]
        outer_copy.append(inner_list_copy)
    return outer_copy
# -------------------------------------
# Below Function - custom [*] deep copy function. Params - nested_list = outer list.
def custom_asterisk(nested_list):
    outer_copy = []
    for inner_list in nested_list:
        inner_list_copy = [*inner_list]
        outer_copy.append(inner_list_copy)
    return outer_copy
# -------------------------------------
# Below Function - custom list() deep copy function. Params - nested_list = outer list.
def custom_list_func(nested_list):
    outer_copy = []
    for inner_list in nested_list:
        inner_list_copy = list(inner_list)
        outer_copy.append(inner_list_copy)
    return outer_copy
# -------------------------------------
# Below Function - custom copy.copy() deep copy function. Params - nested_list = outer list.
def custom_copy_dot_copy(nested_list):
    outer_copy = []
    for inner_list in nested_list:
        inner_list_copy = copy.copy(inner_list)
        outer_copy.append(inner_list_copy)
    return outer_copy
# -------------------------------------
# Below Line - Iteration num for timeit's number param. Change this var to change all timeit() calls.
num = 1000
# -------------------------------------
# Below Section - 1x timeit() call for each copy method.
print('.copy -           ' + str(timeit.timeit('custom_copy(nested_list)', setup = 'from __main__ import custom_copy, nested_list', number = num)))
print('[:] -             ' + str(timeit.timeit('custom_slice(nested_list)', setup = 'from __main__ import custom_slice, nested_list', number = num)))
print('[*] -             ' + str(timeit.timeit('custom_asterisk(nested_list)', setup = 'from __main__ import custom_asterisk, nested_list', number = num)))
print('list() -          ' + str(timeit.timeit('custom_list_func(nested_list)', setup = 'from __main__ import custom_list_func, nested_list', number = num)))
print('copy.copy() -     ' + str(timeit.timeit('custom_copy_dot_copy(nested_list)', setup = 'from __main__ import custom_copy_dot_copy, nested_list', number = num)))
print('copy.deepcopy() - ' + str(timeit.timeit('copy.deepcopy(nested_list)', setup = 'from __main__ import copy, nested_list', number = num)))
print('_pickle -         ' + str(timeit.timeit('_pickle.loads(_pickle.dumps(nested_list))', setup = 'from __main__ import _pickle, nested_list', number = num)))
print('pickle -          ' + str(timeit.timeit('pickle.loads(pickle.dumps(nested_list))', setup = 'from __main__ import pickle, nested_list', number = num)))
print('dill -            ' + str(timeit.timeit('dill.loads(dill.dumps(nested_list))', setup = 'from __main__ import dill, nested_list', number = num)))
# -------------------------------------
# Below Section - The lists which will contain all of the timeit() results for each given copy method (dill is excluded from this part).
dot_copy_timeit_totals_list = []
custom_slice_timeit_totals_list = []
custom_asterisk_timeit_totals_list = []
custom_list_func_timeit_totals_list = []
custom_copy_dot_copy_timeit_totals_list = []
copy_dot_deepcopy_timeit_totals_list = []
_pickle_timeit_totals_list = []
pickle_timeit_totals_list = []
# -------------------------------------
# Below Section - Iterates 1000x timeit() calls (w/ number arg == 1000) for each copy method, appending the returned value (the time of execution for each timeit() call) to each timeit_totals_list.
for x in range(1000):
    dot_copy_timeit_totals_list.append(timeit.timeit('custom_copy(nested_list)', setup = 'from __main__ import custom_copy, nested_list', number = num))
    custom_slice_timeit_totals_list.append(timeit.timeit('custom_slice(nested_list)', setup = 'from __main__ import custom_slice, nested_list', number = num))
    custom_asterisk_timeit_totals_list.append(timeit.timeit('custom_asterisk(nested_list)', setup = 'from __main__ import custom_asterisk, nested_list', number = num))
    custom_list_func_timeit_totals_list.append(timeit.timeit('custom_list_func(nested_list)', setup = 'from __main__ import custom_list_func, nested_list', number = num))
    custom_copy_dot_copy_timeit_totals_list.append(timeit.timeit('custom_copy_dot_copy(nested_list)', setup = 'from __main__ import custom_copy_dot_copy, nested_list', number = num))
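    # Note: this call uses a hard-coded number = 100 rather than num, so its total covers fewer runs than the other methods.
    copy_dot_deepcopy_timeit_totals_list.append(timeit.timeit('copy.deepcopy(nested_list)', setup = 'from __main__ import copy, nested_list', number = 100))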
    _pickle_timeit_totals_list.append(timeit.timeit('_pickle.loads(_pickle.dumps(nested_list))', setup = 'from __main__ import _pickle, nested_list', number = num))
    pickle_timeit_totals_list.append(timeit.timeit('pickle.loads(pickle.dumps(nested_list))', setup = 'from __main__ import pickle, nested_list', number = num))
# -------------------------------------
# Below Section - Prints out each of the copy method's avg. timeit() results in order from fastest to slowest.
print('.copy() -         '+ str(statistics.mean(dot_copy_timeit_totals_list)))
print('[:] -             '+ str(statistics.mean(custom_slice_timeit_totals_list)))
print('[*] -             '+ str(statistics.mean(custom_asterisk_timeit_totals_list)))
print('list() -          '+ str(statistics.mean(custom_list_func_timeit_totals_list)))
print('copy.copy() -     '+ str(statistics.mean(custom_copy_dot_copy_timeit_totals_list)))
print('copy.deepcopy() - '+ str(statistics.mean(copy_dot_deepcopy_timeit_totals_list)))
print('_pickle -         '+ str(statistics.mean(_pickle_timeit_totals_list)))
print('pickle -          '+ str(statistics.mean(pickle_timeit_totals_list)))

Results - (Note: I ran about 20 of these tests using different timeit number args & for loop iteration amounts. The results remained mostly the same.)

  • .copy() was almost always the fastest.

  • [:] was almost always 2nd place.

  • [*] was almost always 3rd place (sometimes it beat [:]).

  • list() was always 4th place.

  • copy.copy() was always 5th place.

  • copy.deepcopy() was always 6th place.

  • _pickle was always 7th place.

  • pickle was always 8th place.

  • dill was by FAR the slowest for some reason (I excluded it from the 1000x loop because it was so slow).

  • .copy(), [:], [*] and list() were by FAR the fastest.

  • copy.copy() was relatively fast compared to the slowest methods, but very slow compared to the fastest: 10x+ slower than the group above.

  • copy.deepcopy() was extremely slow: almost 3x slower than copy.copy() & ~35x slower than the fastest group

  • _pickle & pickle were nearly identical in speed: ~100x slower than the fastest

  • dill may have had some issues with my test b/c it was thousands of times slower than .copy() (roughly 2,800x going by the numbers posted above)! I couldn't get it to be any faster by altering test params.

Test Focus - My test focus was on finding an alternative to copy.deepcopy() for SMALL NESTED LISTS w/ CUSTOM OBJECTS, which I am using in my multithreaded online card game Python project. I found that copy.deepcopy() was causing severe lag in the game (which uses pygame), so I made a bunch of custom functions to mimic a (plain & simple) version of deepcopy.
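
One thing to keep in mind with this kind of replacement: copying the outer list and each inner list gives new lists, but the objects inside them are still shared between the original and the copy, so it is a two-level copy rather than a full deepcopy. A minimal sketch, using a simplified stand-in `Card` class and a hypothetical helper `two_level_copy` (neither is taken from the test code):

```python
import copy


class Card:
    """Simplified stand-in for the Card class in the test code."""

    def __init__(self, rank):
        self.rank = rank


def two_level_copy(nested):
    """Copy the outer list and each inner list; the elements stay shared."""
    return [inner.copy() for inner in nested]


original = [[Card("A"), Card("2")], [Card("K")]]
fast = two_level_copy(original)
deep = copy.deepcopy(original)

# Inner lists are new objects, so appending to the copy leaves the original alone.
fast[0].append(Card("Jo"))
print(len(original[0]), len(fast[0]))

# The cards are the same objects, so mutating one is visible through both lists.
fast[1][0].rank = "Q"
print(original[1][0].rank)

# deepcopy duplicated the cards too, so its copy is unaffected.
print(deep[1][0].rank)
```

If the cards are never mutated after being copied (only moved between lists), the shared references are harmless and the faster approach is enough.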

Objects - I'm iterating through a list of 3 inner lists which contain some custom class objects (cards) which have only a few attributes. I'm not sure if things would perform differently if there were many more attributes or objects. I did alter things (slightly, not extremely) throughout testing concerning OBJECT amount, ATTRIBUTE amount and ATTRIBUTE types. The test results remained mostly the same.

Test Methodology -

  1. Created custom list and card classes
  2. Created an empty nested_list & 3x empty inner_lists
  3. Appended cards to the inner_lists
  4. Appended the inner lists to the outer list
  5. Created custom functions which mimic a simple deepcopy using the various forms of shallow copy (did this for all methods except copy.deepcopy(), _pickle, pickle, & dill)
  6. Created empty timeit_total_lists to contain the timeit totals for each copying method
  7. Using a for loop, iterated 1,000 times through a timeit() call for each copying method (with timeit's number param set to 1,000 as well) (resulting in 1,000,000 total iterations), appending the timeit results to each timeit_totals_list during each iteration
  8. Note: Instead of doing 1,000,000 iterations INSIDE of timeit() via the number arg == 1,000,000, I opted to do 1,000 iterations of timeit() via a for loop w/ the number arg == 1,000 because it seemed to skew the results heavily if the timeit()'s number arg was > 1,000 (for some unknown reason)
  9. Printed the avg/mean using statistics.mean for each copy method's associated timeit_totals_list (in order of performance)
  10. Ran the test script ~20x
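
The loop-and-average approach in steps 6 to 9 could also be written with timeit.repeat, which runs the timing several times and returns one total per run. A rough sketch, assuming plain lists of integers in place of the card objects and a hypothetical helper `bench`. It reports the minimum of the repeats rather than the mean, which is what the timeit documentation suggests, since higher values are usually caused by other processes interfering with the timing:

```python
import copy
import timeit

# Stand-in data: 3 inner lists of small integers instead of Card objects.
nested = [list(range(5)), list(range(6)), list(range(8))]


def bench(func, repeat=5, number=1000):
    """Return the best total time (seconds) for `number` calls of func."""
    return min(timeit.repeat(func, repeat=repeat, number=number))


# Every method is timed with the same repeat and number values.
candidates = {
    ".copy()": lambda: [inner.copy() for inner in nested],
    "[:]": lambda: [inner[:] for inner in nested],
    "list()": lambda: [list(inner) for inner in nested],
    "copy.deepcopy()": lambda: copy.deepcopy(nested),
}

timings = {name: bench(func) for name, func in candidates.items()}
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:<16} {seconds:.6f}")
```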

Test Yourself: Use the code I posted and you can alter some of the objects or change the iteration amounts, etc to fit your needs.

Upvotes: 1

Kevin Wang

Reputation: 2729

Tested in a Jupyter notebook with Python 3.8:

import copy

l = list(range(10000))

%%timeit
[x for x in l]
# 175 µs ± 5.23 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%%timeit
copy.copy(l)
# 22.6 µs ± 365 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%%timeit
l[:]
# 22 µs ± 1.28 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%%timeit
list(l)
# 21.6 µs ± 558 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

So they're all about the same speed, except the list comprehension, which is far slower (roughly 8x here).
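
For anyone without a notebook, roughly the same comparison can be run as a plain script with the timeit module. This is a sketch rather than the exact cells above; the `methods` dictionary and the loop count of 1000 are choices made here for illustration. A likely reason for the gap is that the comprehension executes Python bytecode for every element, while the other three copy the underlying array of references in C.

```python
import copy
import timeit

l = list(range(10000))

# Each entry builds a new shallow copy of l in a different way.
methods = {
    "[x for x in l]": lambda: [x for x in l],
    "copy.copy(l)": lambda: copy.copy(l),
    "l[:]": lambda: l[:],
    "list(l)": lambda: list(l),
}

loops = 1000
per_call = {}
for label, func in methods.items():
    total = timeit.timeit(func, number=loops)  # total seconds for all loops
    per_call[label] = total / loops
    print(f"{label:<16} {per_call[label] * 1e6:8.2f} µs per loop")
```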

Upvotes: 1
