planetmaker

Reputation: 6044

Deepcopy does not result in unique objects

I have a programme which generates a list of fairly simple objects, each wrapping a dictionary, like this:

class loginfo:
    info = {
        'global_time': -1,
        'exptime': 0,
        'type': ""
        # omitted other entries for brevity
        }
    def assign(self, d):
        if d is not None:
            self.info.update((key, d[key]) for key in d if key in self.info)
    def __str__(self):
        s = str(self.info['global_time'])
        for key in self.info:
            if key == 'global_time':
                continue
            s += ",\t"
            s += str(self.info[key])
        s += "\n"
        return(s)

I read lines from a somewhat weird logfile and parse them in a separate routine, parse_linetype, which returns a single loginfo object. I want to build a list of these objects for further processing before I write the result to a nicer logfile:

from copy import deepcopy
output_log = list()
version = -1
global_time = 0
with open(path+in_filename, 'r', errors="ignore") as f:
    for i, line in enumerate(f):
        log_info = parse_linetype(line, version, global_time)
        output_log.append(deepcopy(log_info))
        # debug print:
        print(i,": ",line, len(output_log), "\n", str(output_log[len(output_log)-1]), str(output_log[0]))

Yet this code, unexpectedly to me, produces a list of identical objects, even though I already deepcopy the object when appending it to the list. The objects are returned properly from the parse function and, as an added benefit, information not available in the parsed log line is carried over from the previous log line(s). But that carry-over should NOT influence the objects already stored in output_log.

Where do I go wrong in creating a real deep copy? Where do I keep using a reference, and why?

Typical output of the debugging print currently looks like this. Note that the parsed lines and the returned objects do differ, but all objects in output_log are always identical. Here 'AIN' and 'OTHER' are the parts that changed between records, but the first record (printed as the second object line, for comparison) should of course never change:

6288 :  [10:03:04.546] AIN, 57668, 970, Device handling terminates
 6289 
-1,  56797, AIN

 -1,     56797, AIN


6289 :  57670: Terminating experiment loop
 6290
 -1,     56797, OTHER

 -1,     56797, OTHER

Upvotes: 0

Views: 59

Answers (1)

Martijn Pieters

Reputation: 1122312

Deepcopy doesn't copy classes, only instances. It makes the assumption that any shared class resources are designed to be shared, and you made your info dictionary a class attribute.

This is what the copy module documentation says about this:

This module does not copy types like module, method, stack trace, stack frame, file, socket, window, array, or any similar types. It does “copy” functions and classes (shallow and deeply), by returning the original object unchanged; this is compatible with the way these are treated by the pickle module.

(Bold emphasis mine).
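The effect can be reproduced in a few lines. This is only a minimal sketch: the class name SharedInfo and the single 'type' key are made up for illustration and stand in for the loginfo class from the question.

```python
from copy import deepcopy


class SharedInfo:
    # Class attribute: a single dict object shared by every instance.
    info = {'type': ""}


first = SharedInfo()
first.info['type'] = 'AIN'   # mutates the class-level dict in place
snapshot = deepcopy(first)

# The instance itself holds no attributes, so deepcopy has nothing to copy.
print(vars(first))                        # {}
print(snapshot.info is SharedInfo.info)   # True

# A later change shows up in the "copy" as well.
first.info['type'] = 'OTHER'
print(snapshot.info['type'])              # OTHER
```

Because assign only calls update on self.info, it never creates an instance attribute, so every object in the list keeps looking up the same class-level dictionary.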

Move the dictionary to an instance attribute instead:

class loginfo:
    def __init__(self):
        self.info = {
            'global_time': -1,
            'exptime': 0,
            'type': ""
            # omitted other entries for brevity
        }

    # ...
    # assign and __str__ unchanged.
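With the dictionary created in __init__, deepcopy has per-instance state to copy. The following sketch assumes a trimmed-down class (named LogInfo here) and two invented record dictionaries in place of the real parser output; it is meant to show the behaviour, not to replace the question's parsing code.

```python
from copy import deepcopy


class LogInfo:
    def __init__(self):
        # Instance attribute: each object gets its own dict.
        self.info = {'global_time': -1, 'exptime': 0, 'type': ""}

    def assign(self, d):
        if d is not None:
            self.info.update((key, d[key]) for key in d if key in self.info)


current = LogInfo()
output_log = []

# Hypothetical parsed records; the second one lacks 'exptime'.
for record in ({'type': 'AIN', 'exptime': 56797}, {'type': 'OTHER'}):
    current.assign(record)
    output_log.append(deepcopy(current))

# Earlier snapshots keep their own values ...
print([entry.info['type'] for entry in output_log])   # ['AIN', 'OTHER']
# ... while missing fields are still carried over from the previous record.
print(output_log[1].info['exptime'])                  # 56797
```

The carry-over from one line to the next still works because the same current object keeps being updated, while each appended snapshot now owns an independent dictionary.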

Upvotes: 2
