Reputation: 29267
I would like some database/programming suggestion on a specific issue.
I have 5 different people (that live in different parts of the world) that provide me with data. This data is given to me in many variety of ways, following a standard structure layout. However it's not always harmonized, the data might have extra things that are not in the standard, so I'd like the structure to be as dynamic as possible to accommodate what the person wants to use.
These 5 data sources are then placed inside a central database I host. So basically I have 5 data sources that are formatted following a standard structure, and they are uploaded to my local database.
I want to automate the upload of this data as much as possible for the person providing the data, so I want them to upload new sets of data that are automatically inserted in my local db.
My questions are:
Upvotes: 0
Views: 1182
Reputation: 76
As others have said, EAV tables can handle dynamic structure. (be aware of performance issues on large tables)
But is it in your interest to have your database fields dictated by the client? You can't write business logic to act upon those new fields because they don't exist yet, they could be anything.
Can you force the client to conform to your model? This allows you to know the fields ahead of time and have business logic act upon the fields. It allows you to write meaningful reports as well, rather than just pivoted data dumps.
Upvotes: 0
Reputation: 51485
How should I keep the structure dynamic without having to revisit my standard layout to accommodate new fields of data, or different structure?
Basically, you pivot the normal database idea of columns and rows.
You have a data name table, which consists of the unique names of the fields of data, and an indicator to tell the import process what type of data is stored, like a date, timestamp, or integer.
You have a data table, which contains the data name id, a sequence number, the data field, and a foreign key to identifying information.
The sequence number is used to differentiate between different values of the same data name.
The data field holds every type of data possible. This would be a VARCHAR(MAX) in most databases. It's up to the upload process to convert dates and numbers to strings.
Your data table will have a foreign key to the rest of the information that identifies who the data field belongs to.
How do I make them upload data in a way that is incremental? For example they might be uploading an XML version of their data, my upload code should figure out what already exists.
The short answer is that you can't.
Your upload process has to identify duplicate data and not store it on the database.
My final and most important question. Are there better ways of going about this instead of having an upload infrastructure?
This is a hard question to answer without knowing more about the type of data you're receiving, but there is software that allows you to load databases without a lot of programming, by defining the input data structure and mapping that structure to your database tables.
Upvotes: 3
Reputation: 9943
This is a very general question, but I think I have a general answer. What I think solves your problem is to construct a new relational calculus where the properties attached to the master record are not pre-determined. Here is an example involving a phone book application.
Common method using a non-relational table:
All well and good, but what do you do if the occasional person shows up with a mobile phone, more than one mobile phone, a fax phone, etc.
Instead what you do is:
There is a one-to-many relationship between Person and Phones, and there can be any number of them from zero to a zillion. The tables are JOINed by Person_ID. You have to have business and presentation logic that enumerates the Phone_Type column (or just let it be free-form, which is not as useful but easier).
You can do that for any property, and is what relational data bases are all about. I hope this helps.
Upvotes: 1