Reputation: 4130
I want to pull out some data from a pile of XML files. I have a working parser/extractor, but I couldn't get it to sit nicely in a DB.
I first tried a single, very flat table to hold all my data, but that made it too complex to pull my elements back together.
Having looked back at what I am trying to do, I built a MySQL data model that seems to fit the bill. It comprises a few tables, so the next task is to build a method that will put the extracted data into the appropriate table (having checked for duplicate values, etc.).
I planned to write a generic class that, for each lump of data, takes the data object I pass it, goes to the appropriate table, and checks whether the data already exists. If it doesn't, the class should add it to the table and write the key value to a second table. If it does exist, it simply has to pull back the key value and write that to the second table.
I'm not sure how to describe this in pseudo notation, but does this seem like a sensible approach? The alternative seems to be to write a specific connector/checker/updater for every data lump (and by lump I mean either one or n pieces of specific labelled data that have an appropriate home in one table).
Upvotes: 0
Views: 118
Reputation: 11578
Are you using an ORM with that? If not, this is a good idea.
The general approach is OK, but try to implement it using some generic classes. For example, your implementation could look similar to:
class NodeSaver(object):
    # Generic base class: subclasses set main_table / second_table
    # and implement get_search_values() / get_insert_values().

    def __init__(self, node):
        self.node = node

    def save(self, connection=None):
        # connection is not used in this sketch
        obj = self.get_or_insert_to_first_table()
        self.insert_to_second_table(obj)

    def get_or_insert_to_first_table(self):
        search_values = self.get_search_values()
        main_table = self.get_main_table()
        objects = main_table.objects.filter(**search_values)  # notation from Django ORM
        if objects.exists():
            return objects[0]
        else:
            insert_values = {}
            insert_values.update(search_values)
            insert_values.update(self.get_insert_values())
            return main_table.objects.create(**insert_values)

    def insert_to_second_table(self, obj):
        ...

    def get_main_table(self):
        return self.main_table

    def get_second_table(self):
        return self.second_table


class MyDataLumpSaver(NodeSaver):
    main_table = models.MyData
    second_table = models.OtherData

    def get_search_values(self):
        # return a dict of the field values that identify an existing row
        ...

    def get_insert_values(self):
        # return a dict of the remaining field values needed for a new row
        ...
With classes like this, you can extend them for your data lumps by overriding a few methods. If you like the idea, look at Django Class Based Views. They are written using the same approach.
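If you are not using an ORM, the same structure should work with plain SQL. Below is a minimal sketch, not a definitive implementation. It uses Python's built-in sqlite3 as a stand-in for MySQL so that it runs on its own, and the `tag` and `node_tag` tables, their columns, and the `TagSaver` class are hypothetical names made up for illustration.

```python
import sqlite3


class NodeSaver:
    """Generic get-or-insert logic; subclasses supply tables and values."""

    main_table = None    # table holding the unique values
    second_table = None  # table that records the key for each node

    def __init__(self, node):
        self.node = node

    def save(self, conn):
        key = self.get_or_insert_to_first_table(conn)
        self.insert_to_second_table(conn, key)
        return key

    def get_or_insert_to_first_table(self, conn):
        search = self.get_search_values()
        # Table and column names come from the subclass (trusted code),
        # while the values themselves are passed as bound parameters.
        where = " AND ".join(f"{col} = ?" for col in search)
        row = conn.execute(
            f"SELECT id FROM {self.main_table} WHERE {where}",
            list(search.values()),
        ).fetchone()
        if row is not None:
            return row[0]  # already present: just hand back its key
        values = {**search, **self.get_insert_values()}
        columns = ", ".join(values)
        marks = ", ".join("?" for _ in values)
        cursor = conn.execute(
            f"INSERT INTO {self.main_table} ({columns}) VALUES ({marks})",
            list(values.values()),
        )
        return cursor.lastrowid

    def insert_to_second_table(self, conn, key):
        # Hypothetical layout of the second table: (source, main_id).
        conn.execute(
            f"INSERT INTO {self.second_table} (source, main_id) VALUES (?, ?)",
            (self.node["source"], key),
        )

    def get_search_values(self):
        raise NotImplementedError

    def get_insert_values(self):
        return {}


class TagSaver(NodeSaver):
    main_table = "tag"
    second_table = "node_tag"

    def get_search_values(self):
        return {"name": self.node["tag"]}

    def get_insert_values(self):
        return {"first_seen_in": self.node["source"]}


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE, first_seen_in TEXT);
    CREATE TABLE node_tag (source TEXT, main_id INTEGER);
""")
first_key = TagSaver({"tag": "python", "source": "a.xml"}).save(conn)
second_key = TagSaver({"tag": "python", "source": "b.xml"}).save(conn)
conn.commit()
```

Both calls return the same key, so `tag` ends up with one row and `node_tag` with two. The UNIQUE constraint on `name` is there so that the database itself rejects a duplicate if two writers both get past the SELECT at the same time.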
Upvotes: 1