Reputation: 45291
I have a data input file for a particular program. There are different kinds of data lines in the file, and for each kind of input line there is a corresponding class in another module I created previously. Minimal example:
#data module
class AnalysisObject: pass
class BoundaryNode(AnalysisObject): pass
class MaterialDefiniton(AnalysisObject): pass
It seemed like a good idea to assign a class to each line depending on the kind of data it represents, and to expose that line's corresponding AnalysisObject as a read/write property. I have done that by first establishing an interface (I can't use abc.ABCMeta as the metaclass, for reasons explained below):
#data input file module
class LineBase():
    def __init__(self, line):
        self.line = line
    def analysis_object(self):
        raise NotImplementedError()

class BoundaryNodeLine(LineBase):
    @property
    def analysis_object(self):
        # parsing the string - just an example
        # the logic for each line type is different
        data = self.line.split()
        return data[5], data[7]
    @analysis_object.setter
    def analysis_object(self, bnode):
        self.line = "ANALYSIS NO XX BNODE NO {0} NODE {1}".format(*bnode)

class MaterialDefinitionLine(LineBase):
    ...  # similar implementation to the above
My problem is how to assign each line of the file to the correct class while maintaining separation between my data input module and my data module.
My idea for this is to first load each line into a subclass of str. This subclass can then reassign the object to the correct class (using myobject.__class__ = NewClass) when any of its methods are called.
This all seems to be working ok, however, I'm unsure about whether reassigning an object's class in this way is a good idea. It doesn't seem to be causing any problems for me yet, but I'm not particularly experienced and am not sure what to look for.
class DataLine(LineBase):
    def _switch_class(self, klass):
        self.__class__ = klass
    def __getattr__(self, name):
        # only called when normal attribute lookup fails
        self.__class__ = line_type_detector(self.line)
        return getattr(self, name)
Parenthetically: this class is the reason I can't give LineBase the abc.ABCMeta metaclass with:
    @abc.abstractmethod
    def analysis_object(self): etc. etc. etc.
...because then the class reassignment would never happen when DataLine().analysis_object gets called.
Detecting which class each line should be is not a problem; I have a simple line_type_detector for that:
def line_type_detector(data_line):
    # -figure out what class the line should be-
    return ItShouldBeThisClass
Upvotes: 0
Views: 60
Reputation: 388023
Usually, when different types are used to identify different kinds of data like this, the more specialized types are expected to have additional properties or methods of their own. For example, a MaterialDefiniton might hold information about the material which the base AnalysisObject would not. So the reason for choosing a different type is to be able to store properties that are not common to all of them.
Now, if that’s not the case, and the types are merely for identification and not for differences in state or behavior, then you shouldn’t use different types at all. Use one type and give it a property like kind that tells which kind of data it holds. So you would have, for example, obj.kind = 'materialdefinition' or obj.kind = 'boundarynode'.
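As a rough sketch of that single-type idea (the constructor arguments, the values attribute, the parse_line helper, and the assumption that the first token of a line names its kind are all hypothetical, not taken from the question):

```python
# Hypothetical sketch: one type with a `kind` attribute instead of subclasses.
class AnalysisObject:
    def __init__(self, kind, values):
        self.kind = kind      # e.g. 'materialdefinition' or 'boundarynode'
        self.values = values  # remaining fields of the line, left unparsed


def parse_line(line):
    # Assumption: the first whitespace-separated token names the kind of line.
    kind, *values = line.split()
    return AnalysisObject(kind.lower(), values)


obj = parse_line("boundarynode 12 7")
print(obj.kind, obj.values)
```

Code that consumes these objects then branches on obj.kind rather than on the object's type.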
But I’m going to assume that this is not the case, that those objects are actually different and that—when parsing a line—you would want to fill different properties with values depending on the line type.
So, what’s wrong with the data types being able to parse such lines themselves? Nothing, I would say. It’s very common to have types that are able to serialize or deserialize themselves. And I don’t think it would violate separation of concerns; after all, the type is only making sure that its state can be stored in a file, or loaded from a file.
In Python, you could implement a type’s __repr__ method to get a string representation that can be saved to a file; repr is meant to give you something that, where possible, fully represents the state of the object. For parsing, you can create a function that takes a string and returns one of the objects depending on what it parses. For example:
def parse(line):
    data = line.split()
    if data[0] == 'material':
        obj = MaterialDefiniton()
        obj.material = data[1]
        obj.otherstuff = data[2]
    elif data[0] == 'boundary':
        obj = BoundaryNode()
        obj.boundary = data[1]
    else:
        obj = AnalysisObject()
        obj.raw_data = data
    return obj
You could also move the type-specific parsing into each type’s constructor, so you can just pass the line to the objects and they’ll build themselves. Or you could create a classmethod that explicitly parses the data:
class MaterialDefinition:
    @classmethod
    def parse(cls, data):
        obj = cls()  # This is equivalent to `obj = MaterialDefinition()`
        obj.material = data[1]
        obj.otherstuff = data[2]
        return obj
And then you can just choose which object you need in the general parse function:
def parse(line):
    data = line.split()
    if data[0] == 'material':
        return MaterialDefinition.parse(data)
    elif data[0] == 'boundary':
        return BoundaryNode.parse(data)
    else:
        return AnalysisObject.parse(data)
You could even move that parse function’s logic completely into the AnalysisObject.parse method, so that it returns whatever subtype is most appropriate.
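One possible way to do that last step is to have each subclass register the keyword it handles, so that AnalysisObject.parse can look up the right subtype. This is only a sketch: the keyword attribute, the _registry dict, and the from_fields helper are names I made up for illustration, it relies on __init_subclass__ (Python 3.6+), and it assumes every line is non-empty.

```python
class AnalysisObject:
    _registry = {}   # maps the first token of a line to the handling subclass
    keyword = None   # hypothetical: token that identifies a subclass's lines

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.keyword is not None:
            AnalysisObject._registry[cls.keyword] = cls

    @classmethod
    def parse(cls, line):
        data = line.split()
        # Fall back to the base type for unknown line kinds.
        subclass = cls._registry.get(data[0], AnalysisObject)
        return subclass.from_fields(data)

    @classmethod
    def from_fields(cls, data):
        obj = cls()
        obj.raw_data = data
        return obj


class MaterialDefinition(AnalysisObject):
    keyword = 'material'

    @classmethod
    def from_fields(cls, data):
        obj = cls()
        obj.material = data[1]
        obj.otherstuff = data[2]
        return obj


class BoundaryNode(AnalysisObject):
    keyword = 'boundary'

    @classmethod
    def from_fields(cls, data):
        obj = cls()
        obj.boundary = data[1]
        return obj


obj = AnalysisObject.parse("material steel 210")
print(type(obj).__name__, obj.material)
```

Adding a new line kind then only means adding a new subclass; the parse method itself does not change.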
Upvotes: 2
Reputation: 149085
I do not know how to change the class of an object once it has been constructed, so I would rather assign the class at creation time. You can do that explicitly with a function createLine that returns an object of the appropriate subclass:
def createLine(data_line):
    # call the detector on the line to get the target class
    typ = line_type_detector(data_line)
    if typ == BoundaryNodeLine:
        return BoundaryNodeLine(data_line)
    ...
You could also use a __new__ special method in LineBase:
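class LineBase:
    def __new__(cls, data_line):
        if cls is LineBase:
            # Dispatch only when the base class itself is called; subclasses
            # inherit __new__, so dispatching unconditionally would recurse.
            return createLine(data_line)
        return super().__new__(cls)
    ...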
where createLine is the same function as above.
Then you could simply create your objects with:
line = LineBase(data_line)
and directly get objects of the correct subclass.
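Here is a small self-contained sketch of the __new__ approach. The detection rule in line_type_detector is a placeholder I invented for the example (the real one is whatever the question's detector does), and the subclasses are left empty.

```python
class LineBase:
    def __new__(cls, data_line):
        # Dispatch only when LineBase itself is called; subclasses inherit
        # __new__, so they must be allowed to construct themselves directly.
        target = line_type_detector(data_line) if cls is LineBase else cls
        return super().__new__(target)

    def __init__(self, data_line):
        self.line = data_line


class BoundaryNodeLine(LineBase):
    pass


class MaterialDefinitionLine(LineBase):
    pass


def line_type_detector(data_line):
    # Placeholder rule, assumed for this sketch only.
    return BoundaryNodeLine if "BNODE" in data_line else MaterialDefinitionLine


line = LineBase("ANALYSIS NO XX BNODE NO 3 NODE 17")
print(type(line).__name__, line.line)
```

Because the returned object is still an instance of LineBase, Python calls __init__ on it as usual, so self.line is set without any extra work.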
Upvotes: 1