Sherwin
Sherwin

Reputation: 25

What is a faster way to deep copy a nested list without using .deepcopy()?

I have two variables with the following structures:

# initialization
    O1 = [[] for g in range(n)] 
    for i in range(n):
        O1[i] = [[] for g in range(m)] 
         
    O2 = [[] for g in range(n)] 
    for i in range(n):
        O2[i] = [[] for g in range(m)] 

Then, I input these two into an initial function where different values are appended to them as:

for i in range(n):
    for j in range(m):
        O1[i][j].append([s, as, pat, 0])
        O2[i][j].append([[ol, oa, os, ot],[dl, da, ds],tpat, tp, 0])

As you can see, the internal structure of the lists gets complicated, especially in the case of O2. After this value assignment to O1 and O2, they are input into several other functions. Each function needs to make a deep copy of O1 and O2 and modify the copy for its own use without changing the original O1 and O2 variables. The changes include both .remove() and .append() and also +/- of values in the internal lists. The important thing is that no matter the changes, the original O1 and O2 should not change at all. This process runs iteratively where first O1 and O2 are assigned new values and then input into several other functions and copied and edited without any change in the original O1 and O2 that were used as input.

Using .deepcopy() is the only way that I know but as this is an iterative process, .deepcopy() function slows the code significantly down especially when O1 and O2 are large. I have tried using tuples as an immutable data structure but given the initial and internal structure and the changes made in the functions, it does not work. using tuples prevent the append() and remove() changes but not +/- operations.

I would be grateful if someone could suggest a faster way to do this.

Upvotes: -1

Views: 111

Answers (2)

Daweo
Daweo

Reputation: 36285

If this is not verboten consider using PyPy, I prepared following simple rig

import copy
import itertools
import time

n = 500
cnt = itertools.count(1)
structure = [[[next(cnt) for _ in range(n)] for _ in range(n)] for _ in range(n)]

t1 = time.time()
structure_copy = copy.deepcopy(structure)
t2 = time.time()

print('Elapsed time (seconds) %.03f' % (t2-t1))

it does create 3D structure of shape n x n x n with subsequent integer numbers and then I measured time required to deepcopy it, got following results

  • Python 3.10.12 Elapsed time (seconds) 53.793
  • PyPy 7.3.9 with GCC 11.3.0 Elapsed time (seconds) 11.703

That is 5x times faster without any change required in code. Note that these are not most recent versions, so you might get different result. Please test this approach if you are allowed to install pypy3 and write if it give any speedup.

Upvotes: 0

MuhammadHassnain
MuhammadHassnain

Reputation: 31

There can be multiple solutions:

  1. Use numpy arrays instead of normal python lists, there memory layout allows for faster operations.
  2. Use copy-on-write techniques. You can use a library like pyrsistent (https://github.com/tobgu/pyrsistent) that delays copying until modifications are actually made, so essentially on the parts that are modified will be copied.

Upvotes: 0

Related Questions