Reputation: 1767
I'm looking to make a deep copy of a class object that contains a list of class objects, each with their own set of stuff. The objects don't contain anything more exciting than ints and lists (no dicts, no generators waiting to yield, etc). I'm performing the deep copy on between 500-800 objects a loop, and it's really slowing the program down. I realize that this is already inefficient; it currently can't be changed.
Example of how it looks:
import random
import copy
class Base:
def __init__(self, minimum, maximum, length):
self.minimum = minimum
self.maximum = maximum
self.numbers = [random.randint(minimum, maximum) for _ in range(length)]
# etc
class Next:
def __init__(self, minimum, maximum, length, quantity):
self.minimum = minimum
self.maximum = maximum
self.bases = [Base(minimum, maximum, length) for _ in range(quantity)]
# etc
Because of the actions I'm performing on the objects, I can't shallow copy. I need the contents to be owned by the new variable:
> first = Next(0, 10, 5, 10)
> second = first
> first.bases[0].numbers[1] = 4
> print(first.bases[0].numbers)
> [2, 4, 3, 3, 8]
> print(second.bases[0].numbers)
> [2, 4, 3, 3, 8]
>
> first = Next(0, 10, 5, 10)
> second = copy.deepcopy(first)
> first.bases[0].numbers[1] = 4
> print(first.bases[0].numbers)
> [8, 4, 7, 9, 9]
> print(second.bases[0].numbers)
> [8, 11, 7, 9, 9]
I've tried a couple of different ways, such as using json to serialize and reload the data, but in my tests it's not been nearly fast enough, because I'm stuck reassigning all of the variables each time. My attempt at pulling off a clever self.__dict__ = dct
hasn't worked because of the nested objects.
Any ideas for how to efficiently deep copy multiply-nested Python objects without using copy.deepcopy?
Upvotes: 6
Views: 10261
Reputation: 1767
Based on cherish's answers here, pickle.loads(pickle.dumps(first))
works about twice as fast per call. I had written it off initially because of an unrelated error when testing it, but on retesting it, it performs well within my needs.
Upvotes: 9
Reputation: 21453
One of the first things that copy.deepcopy
looks for is if the object defines it's own __deepcopy__
method So instead of having it figure out how to copy object every time just define your own process.
It would require you have a way of defining a Base
object without any element of random-ness for the copies to use, but if you can find a more efficient process of copying your objects you should define it as a __deepcopy__
method to speed up copying process.
Upvotes: 5