Rick

Reputation: 45291

Manage an object by switching its class

I have a data input file for a particular program. There are different kinds of data lines in the file, and for each kind of input line there is a corresponding class in another module I created previously. Minimal example:

#data module
class AnalysisObject: pass
class BoundaryNode(AnalysisObject): pass
class MaterialDefiniton(AnalysisObject): pass

It seemed like a good idea to assign a class to each line depending on the kind of data it represents, and to expose that line's corresponding AnalysisObject as a read/write property. I have done that by first establishing an interface (I can't use abc.ABCMeta for reasons explained below):

#data input file module

class LineBase():
    def __init__(self, line):
        self.line = line
    def analysis_object(self):
        raise NotImplementedError()

class BoundaryNodeLine(LineBase):
    @property
    def analysis_object(self):
        #parsing the string - just an example
        #the logic for each line type is different
        data = self.line.split()
        return data[5], data[7]
    @analysis_object.setter
    def analysis_object(self, bnode):
        self.line = "ANALYSIS NO XX BNODE NO {0} NODE {1}".format(*bnode)

class MaterialDefinitionLine(LineBase):
    pass  # similar implementation to the above

My problem is how to assign each line of the file to the correct class while maintaining separation between my data input module and my data module.

My idea for this is to first load each line into a subclass of str. This subclass can then reassign the object to the correct class (using myobject.__class__ = NewClass) when any of its methods are called.

This all seems to be working ok, however, I'm unsure about whether reassigning an object's class in this way is a good idea. It doesn't seem to be causing any problems for me yet, but I'm not particularly experienced and am not sure what to look for.

class DataLine(LineBase):
    def _switch_class(self,klass):
        self.__class__ = klass
    def __getattr__(self,name):
        self.__class__ = line_type_detector(self.line)
        return getattr(self, name)

Parenthetically: this class is the reason I can't give LineBase the abc.ABCMeta metaclass with:

@abc.abstractmethod
def analysis_object(self): ...  # etc.

...because then the class reassignment would never happen when DataLine().analysis_object gets called.

Detecting which class each line should be is not a problem; I have a simple line_type_detector for that:

def line_type_detector(data_line):
    # figure out what class the line should be
    return ItShouldBeThisClass

Upvotes: 0

Views: 60

Answers (2)

poke

Reputation: 388023

Usually, when different types are used to identify different kinds of data like this, the more specialized types are expected to have additional properties or methods that are specific to them. For example, a MaterialDefiniton might hold information about the material which the base AnalysisObject would not. So the reason for choosing a different type would be to store properties that are not common to all of them.

Now, if that’s not the case, and the types are merely for identification and not for state or behavior differences, then you just shouldn’t use different types at all. Just use one type, and give it a property like kind which tells which kind of data you are referring to. So you would have for example obj.kind = 'materialdefinition', or obj.kind = 'boundarynode'.
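A minimal sketch of that single-type idea, assuming Python 3. The fields attribute and the sample values are made up for illustration and are not part of the question:

```python
class AnalysisObject:
    """One type for every line; `kind` records what the line represents."""

    def __init__(self, kind, fields):
        self.kind = kind      # e.g. 'materialdefinition' or 'boundarynode'
        self.fields = fields  # hypothetical: the raw values taken from the line


objects = [
    AnalysisObject('materialdefinition', ['steel', '210e9']),
    AnalysisObject('boundarynode', ['12', '7']),
]

# Select by kind instead of by isinstance() checks.
boundary_nodes = [obj for obj in objects if obj.kind == 'boundarynode']
```

This only pays off when no kind needs state or behavior of its own; once one does, separate types are the better fit.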

But I’m going to assume that this is not the case, that those objects are actually different and that—when parsing a line—you would want to fill different properties with values depending on the line type.

So, what’s wrong with the data types being able to parse such lines themselves? Nothing, I would say. It’s very common to have types that are able to serialize or deserialize themselves. And I don’t think it would violate separation of concerns; after all, the type is just making sure that its state can be stored in a file, or loaded from a file.

In Python, you could actually just implement a type’s __repr__ method to get a string representation that can be saved to a file. repr is meant to give you something that fully represents the state of the object. And for parsing, you can just create a function that takes a string and returns one of the objects depending on what the function parses. For example:

def parse(line):
    data = line.split()
    if data[0] == 'material':
        obj = MaterialDefiniton()
        obj.material = data[1]
        obj.otherstuff = data[2]
    elif data[0] == 'boundary':
        obj = BoundaryNode()
        obj.boundary = data[1]
    else:
        obj = AnalysisObject()

    obj.raw_data = data
    return obj
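For the saving direction, here is a rough sketch of a __repr__ that writes a line in the same made-up format the parse function reads back. The constructor arguments and the 'material ...' layout are assumptions for illustration; a __str__ or a dedicated method would work the same way:

```python
class AnalysisObject:
    pass


class MaterialDefiniton(AnalysisObject):
    def __init__(self, material=None, otherstuff=None):
        self.material = material
        self.otherstuff = otherstuff

    def __repr__(self):
        # Hypothetical line format: keyword first, then the two values.
        return 'material {0} {1}'.format(self.material, self.otherstuff)


obj = MaterialDefiniton('steel', 'isotropic')
line = repr(obj)       # the text that would be written to the file
fields = line.split()  # what a parser would see when reading it back
```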

You could also move the individual parsing of the type into the type’s constructor, so you can just pass the line to the objects and they’ll build themselves. Or you create a classmethod that explicitly parses the data:

class MaterialDefinition:
    @classmethod
    def parse(cls, data):
        obj = cls() # This is equivalent to `obj = MaterialDefinition()`
        obj.material = data[1]
        obj.otherstuff = data[2]
        return obj

And then you can just choose which object you need in the general parse function:

def parse(line):
    data = line.split()
    if data[0] == 'material':
        return MaterialDefinition.parse(data)
    elif data[0] == 'boundary':
        return BoundaryNode.parse(data)
    else:
        return AnalysisObject.parse(data)

You could even move that parse function’s logic completely into the AnalysisObject.parse method, so that one returns whatever subtype is most appropriate.
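One way that last variant might look, as a sketch only: the keyword table and the _fill helper are hypothetical names introduced here, and the 'material' / 'boundary' keywords are the same made-up ones as in the examples above.

```python
class AnalysisObject:
    @classmethod
    def parse(cls, line):
        data = line.split()
        # Hypothetical keyword-to-type table; unknown keywords fall back
        # to the plain base type.
        subtypes = {'material': MaterialDefinition, 'boundary': BoundaryNode}
        target = subtypes.get(data[0], AnalysisObject) if data else AnalysisObject
        obj = target()
        obj._fill(data)
        return obj

    def _fill(self, data):
        # Hypothetical hook: each subtype stores the fields it cares about.
        self.raw_data = data


class MaterialDefinition(AnalysisObject):
    def _fill(self, data):
        super()._fill(data)
        self.material = data[1]
        self.otherstuff = data[2]


class BoundaryNode(AnalysisObject):
    def _fill(self, data):
        super()._fill(data)
        self.boundary = data[1]


node = AnalysisObject.parse('boundary 12')
```

Callers then only ever touch AnalysisObject.parse and get back whichever subtype matches the line.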

Upvotes: 2

Serge Ballesta

Reputation: 149085

I do not know how to change the class of an object once it has been constructed, so I would rather assign the class at creation time. You can do that explicitly with a function createLine that returns an object of the appropriate subclass:

def createLine(data_line):
    typ = line_type_detector(data_line)
    if typ is BoundaryNodeLine:
        return BoundaryNodeLine(data_line)
    ...

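You could also use a __new__ special method in LineBase:

class LineBase:
    def __new__(cls, data_line):
        # Dispatch only when LineBase itself is called; subclasses inherit
        # __new__, so dispatching for them too would recurse without end.
        if cls is LineBase:
            return createLine(data_line)
        return super().__new__(cls)
    ...

where createLine is the same function as above.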

Then you could simply create your objects with:

line = LineBase(data_line)

and directly get objects of the correct subclass.
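Put together, a self-contained sketch of this approach could look like the following, assuming Python 3. The detector body is a stand-in (the 'BNODE' keyword test is an assumption), and the subclasses are left empty for brevity:

```python
class LineBase:
    def __new__(cls, data_line):
        # Only a direct LineBase(...) call dispatches; subclasses are
        # constructed normally so the dispatch cannot recurse.
        if cls is LineBase:
            return createLine(data_line)
        return super().__new__(cls)

    def __init__(self, line):
        self.line = line


class BoundaryNodeLine(LineBase):
    pass


class MaterialDefinitionLine(LineBase):
    pass


def line_type_detector(data_line):
    # Stand-in detector: checking for a 'BNODE' token is an assumption.
    if 'BNODE' in data_line.split():
        return BoundaryNodeLine
    return MaterialDefinitionLine


def createLine(data_line):
    # The detector returns a class; calling it builds the instance.
    return line_type_detector(data_line)(data_line)


line = LineBase("ANALYSIS NO XX BNODE NO 3 NODE 17")
```

Note that __init__ runs a second time on the returned object, because Python calls it whenever __new__ returns an instance of the class being called; here that just sets line again.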

Upvotes: 1
