Manage an object by switching its class

Question

I have a data input file for a particular program. There are different kinds of data lines in the file, and for each kind of input line there is a corresponding class in another module I created previously. Minimal example:

#data module
class AnalysisObject: pass
class BoundaryNode(AnalysisObject): pass
class MaterialDefinition(AnalysisObject): pass

It seemed like a good idea to assign a class to each line depending on the kind of data it represents, and to expose that line's corresponding AnalysisObject as a read/write property. I have done that by first establishing an interface (I can't use abc.ABCMeta for reasons explained below):

#data input file module

class LineBase():
    def __init__(self, line):
        self.line = line
    def analysis_object(self):
        raise NotImplementedError()

class BoundaryNodeLine(LineBase):
    @property
    def analysis_object(self):
        #parsing the string - just an example
        #the logic for each line type is different
        data = self.line.split()
        return data[5], data[7]
    @analysis_object.setter
    def analysis_object(self, bnode):
        self.line = "ANALYSIS NO XX BNODE NO {0} NODE {1}".format(*bnode)

class MaterialDefinitionLine(LineBase):
    pass  # similar implementation to the above

My problem is how to assign each line of the file lines to the correct class while maintaining separation between my data input module and my data module.

My idea for this is to first load each line into a subclass of str. This subclass can then reassign the object to the correct class (using myobject.__class__ = NewClass) when any of its methods are called.

This all seems to be working ok, however, I'm unsure about whether reassigning an object's class in this way is a good idea. It doesn't seem to be causing any problems for me yet, but I'm not particularly experienced and am not sure what to look for.

class DataLine(LineBase):
    def _switch_class(self, klass):
        self.__class__ = klass
    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails
        self.__class__ = line_type_detector(self.line)
        return getattr(self, name)

Parenthetically: this class is the reason I can't give LineBase the abc.ABCMeta metaclass with:

@abc.abstractmethod
def analysis_object(self): ...  # etc.

...because then the class reassignment would never happen when DataLine().analysis_object gets called.

Detecting which class each line should be is not a problem; I have a simple line_type_detector for that:

def line_type_detector(data_line):
    # figure out what class the line should be
    return ItShouldBeThisClass

poke · Accepted Answer

Usually, when different types are used to identify different kinds of data like this, the more specialized types are expected to have additional properties or methods of their own. For example, a MaterialDefinition might hold information about the material which the base AnalysisObject would not. So the reason for choosing a different type would be to store properties that are not common to all of them.

Now, if that’s not the case, and the types are merely for identification and not for differences in state or behavior, then you shouldn’t use different types at all. Use one type and give it a property like kind that tells which kind of data you are referring to. So you would have, for example, obj.kind = 'materialdefinition' or obj.kind = 'boundarynode'.
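A minimal sketch of that single-type approach follows. The values attribute and the sample field values are assumptions made up for illustration; only the kind property comes from the suggestion above.

```python
class AnalysisObject:
    """One type for every line; `kind` records what the line describes."""

    def __init__(self, kind, values):
        self.kind = kind      # e.g. 'materialdefinition' or 'boundarynode'
        self.values = values  # hypothetical: remaining fields, stored uniformly


# Hypothetical sample data, not taken from a real input file
objects = [
    AnalysisObject('materialdefinition', ['steel', '7850']),
    AnalysisObject('boundarynode', ['12', '40']),
]

# Select by kind instead of by isinstance()
boundary_nodes = [obj for obj in objects if obj.kind == 'boundarynode']
```

Code that needs to treat the kinds differently then branches on obj.kind rather than on the object's class.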

But I’m going to assume that this is not the case: that those objects are actually different, and that when parsing a line you would want to fill different properties with values depending on the line type.

So, what’s wrong with the data types being able to parse such lines themselves? Nothing, I would say. It’s very common to have types that are able to serialize or deserialize themselves. And I don’t think it would violate separation of concerns; after all, the type is making sure that its state can be stored in a file, or loaded from a file.

In Python, you could implement a type’s __repr__ method to get a string representation that can be saved to a file. repr is meant to give you an unambiguous representation of the object, ideally one from which the object could be recreated. And for parsing, you can create a function that takes a string and returns one of the objects depending on what the function parses. For example:

def parse(line):
    data = line.split()
    if data[0] == 'material':
        obj = MaterialDefinition()
        obj.material = data[1]
        obj.otherstuff = data[2]
    elif data[0] == 'boundary':
        obj = BoundaryNode()
        obj.boundary = data[1]
    else:
        obj = AnalysisObject()

    obj.raw_data = data
    return obj
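For the saving half, a rough sketch of the __repr__ idea could look like this. It assumes the same made-up 'boundary <value>' line layout as the parse function above, and it only covers BoundaryNode; the real file format will differ.

```python
class AnalysisObject:
    pass


class BoundaryNode(AnalysisObject):
    def __init__(self, boundary=None):
        self.boundary = boundary

    def __repr__(self):
        # Emit the same whitespace-separated layout that parse() reads
        # (assumed layout: 'boundary <value>')
        return 'boundary {0}'.format(self.boundary)


def parse(line):
    # Reduced version of the parse function above, boundary lines only
    data = line.split()
    if data and data[0] == 'boundary':
        return BoundaryNode(data[1])
    return AnalysisObject()


node = parse('boundary 42')
line = repr(node)        # string that can be written back to the file
restored = parse(line)   # parsing it again gives an equivalent object
```

If you would rather keep __repr__ for debugging output, the same body works under a separate method name of your choosing.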

You could also move the individual parsing of the type into the type’s constructor, so you can just pass the line to the objects and they’ll build themselves. Or you could create a classmethod that explicitly parses the data:

class MaterialDefinition:
    @classmethod
    def parse(cls, data):
        obj = cls() # This is equivalent to `obj = MaterialDefinition()`
        obj.material = data[1]
        obj.otherstuff = data[2]
        return obj

And then you can just choose which object you need in the general parse function:

def parse(line):
    data = line.split()
    if data[0] == 'material':
        return MaterialDefinition.parse(data)
    elif data[0] == 'boundary':
        return BoundaryNode.parse(data)
    else:
        return AnalysisObject.parse(data)

You could even move that parse function’s logic completely into the AnalysisObject.parse method, so that one returns whatever subtype is most appropriate.
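One possible way to do that is sketched below. Instead of an if/elif chain it uses a small registry filled in by __init_subclass__ (Python 3.6+), which is a design choice of this sketch and not something the code above requires. The keyword attribute, the _registry dict, and the load method are hypothetical names, and the 'material'/'boundary' first tokens are the same assumed layout as before.

```python
class AnalysisObject:
    keyword = None   # hypothetical: first token that identifies the line type
    _registry = {}   # hypothetical: maps keyword -> subclass

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Every subclass that declares a keyword registers itself
        if cls.keyword is not None:
            AnalysisObject._registry[cls.keyword] = cls

    @classmethod
    def parse(cls, line):
        data = line.split()
        keyword = data[0] if data else None
        # Fall back to the base type for unknown or empty lines
        target = AnalysisObject._registry.get(keyword, AnalysisObject)
        obj = target()
        obj.raw_data = data
        obj.load(data)
        return obj

    def load(self, data):
        """Subclasses override this to fill in their own attributes."""


class MaterialDefinition(AnalysisObject):
    keyword = 'material'

    def load(self, data):
        self.material = data[1]
        self.otherstuff = data[2]


class BoundaryNode(AnalysisObject):
    keyword = 'boundary'

    def load(self, data):
        self.boundary = data[1]


material = AnalysisObject.parse('material steel 7850')
unknown = AnalysisObject.parse('comment anything')
```

Here material ends up as a MaterialDefinition and unknown as a plain AnalysisObject, so the object has its final type from the moment it is created and no class reassignment is needed.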
