ragardner
ragardner

Reputation: 1975

Why is this method of copying a list so much faster than these others?

I have run this list copying experiment and found one of the times to be much quicker than the others, as far as I can tell it does copy the list, unless someone can explain otherwise?

import time

def test(d):
    s = time.time()
    for i in range(20000000):
        d['b'] = d['a'].copy()
    e = time.time()
    print (".copy(): "+str(e-s))

    s = time.time()
    for i in range(20000000):
        d['b'][:] = d['a']
    e = time.time()
    print ("[:] on left: "+str(e-s))

    s = time.time()
    for i in range(20000000):
        d['b'] = d['a'][:]
    e = time.time()
    print ("[:] on right: "+str(e-s))

d = {'a': [1,2,3,4],
     'b': []}
test(d)

results:

.copy(): 4.150316476821899
[:] on left: 2.535377025604248
[:] on right: 4.140159606933594

To prove [:] on left is copying?

d['b'][:] = d['a']
del d['a'][1]
d['a'][0] = "testing"
print (d)

prints:

{'a': ['testing', 3, 4], 'b': [1, 2, 3, 4]}

Can anyone explain what's going on here?

Upvotes: 3

Views: 286

Answers (2)

Martijn Pieters
Martijn Pieters

Reputation: 1122522

By using [:] in an assignment you have expanded the left-hand side list to have the same number of elements as the right-hand list. The rest of your tests then re-use those extra elements; d['b'] is no longer empty!

In other words, for everything but the first iteration, that copy operation doesn't have to allocate more indices for the list. That makes it faster.

You need to test with an empty list on the left-hand side each time. Use the timeit module to run proper time trials. Use suitably large inputs:

>>> timeit('a = []; a[:] = b', 'from __main__ import testlist as b')
2.430507270000817
>>> timeit('a = b[:]', 'from __main__ import testlist as b')
2.5209442199993646
>>> timeit('a = b.copy()', 'from __main__ import testlist as b')
2.5766620540016447

Allocating more space doesn't actually take all that much time; you only see the effect for small lists:

>>> timeit('a[:] = b', 'a = [1, 2, 3, 4]; b = a[:]', number=10000000)
0.7304684859991539
>>> timeit('a = []; a[:] = b', 'a = [1, 2, 3, 4]; b = a[:]', number=10000000)
0.9908717719954439

(note, the iterations have been increased from 1 million to 10 million here).

For large lists, the difference is mostly moot:

>>> timeit('a[:] = b', 'from __main__ import testlist as b; a = b[:]')
2.4996446009972715
>>> timeit('a = []; a[:] = b', 'from __main__ import testlist as b')
2.4407932980029727

In this specific test run starting with an empty list was actually slightly faster; the difference is mostly noise however, re-running these two tests have the timings trade places regularly.

Upvotes: 4

kindall
kindall

Reputation: 184200

d['b'][:] = d['a'] does not need to allocate memory for the list d['b'], since it has already been allocated by the previous operation and the existing list object is being reused. The others create a new list.

Upvotes: 1

Related Questions