Reputation: 3127
We have business users who are entering product information into Excel spreadsheets. I have been tasked with coming up with a way of getting this information into our SQL Server DB. The problem is that the Excel spreadsheets aren't just a flat table; they're hierarchical. They look something like this:
-[Product 1] [Other fields]...
    -[Maintenance item 1] [Other fields]...
        -[Maintenance task 1] [other fields]...
    -[Maintenance item 2] [Other fields]...
        -[Maintenance task 2] [other fields]...
        -[Maintenance task 3] [other fields]...
-[Product 2] [Product Description] [Other fields]...
ETC.......
So there can be 0-many maintenance items for a product and 0-many maintenance tasks for a maintenance item. This is how the database is structured. I need to come up with a standard Excel template I can send out to our business users so they can input this information, and then figure out how to export it into SQL Server. The volume is going to be high, so I need the import to be somewhat automated. How should I do this?
Upvotes: 3
Views: 3711
Reputation: 23505
I would add VBA code to the template to add as much structure and intelligence as possible to the user data entry and validation.
In the extreme case, you make the user enter all data via Forms that put the validated data on the sheet, and then have an overall validation routine built into the Save or Close event.
A less extreme approach would be to add three command buttons that run code to:
- add product
- add maintenance item
- add maintenance task
plus some overall validation code at save/close.
This way you add as much smarts as possible to the data entry tasks.
Use Named Cells or other hidden metadata created by the VBA code as markers so that your DB update routine can make better sense of the data.
The last one I built this way took 3-4 man-weeks, including the DB update routines, but I think it was probably more complicated than your example.
But if you are not experienced with VBA and the Excel object model and events, it would obviously take much longer.
Upvotes: 1
Reputation: 16677
I agree with the previous posts in general...
My suggestion: avoid the spreadsheet entirely. Spend your time making a simple front-end form, preferably a web-based one. Capture the data as cleanly as possible. Anything here will be cleaner than the spreadsheet, even just having named fields.
You will spend less time in the end.
Upvotes: 1
Reputation: 50970
I don't believe you'll find an import tool that will do this for you. Instead, you're going to have to write a script to ETL the spreadsheet files. I do a lot of this in Python (I'm doing it today, in fact).
Make sure that you handle exceptions at the per-cell level, reporting to the user exactly which cell had unexpected information. With spreadsheets created by hand, it is guaranteed that you will have to handle this on a regular basis.
That said, if this is coming to you as XLSX it might be possible to develop an XML translation to convert it to some more tractable XML document.
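As a rough illustration of the per-cell error handling suggested above, here is a minimal Python sketch. It assumes the rows have already been read into plain lists (for example with a library such as openpyxl) and uses a made-up two-column layout (`name`, `interval_days`); the column names, parsers, and the `CellError` type are hypothetical, not part of any real template.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class CellError:
    cell: str      # spreadsheet-style address, e.g. "B7"
    value: Any     # the raw value found in the cell
    message: str   # what was wrong with it


def column_letter(index: int) -> str:
    """Convert a 0-based column index to a letter (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def parse_text(value: Any) -> str:
    if value is None or not str(value).strip():
        raise ValueError("required text is missing")
    return str(value).strip()


def parse_positive_int(value: Any) -> int:
    number = float(value)  # raises ValueError/TypeError on bad input
    if not number.is_integer() or number <= 0:
        raise ValueError("must be a positive whole number")
    return int(number)


# Hypothetical column layout: (field name, parser) per column.
COLUMNS = [("name", parse_text), ("interval_days", parse_positive_int)]


def validate_rows(rows, first_row=2):
    """Return (clean_records, errors); every bad cell is reported by address."""
    clean, errors = [], []
    for r, row in enumerate(rows, start=first_row):
        record, row_ok = {}, True
        for c, (field, parser) in enumerate(COLUMNS):
            value = row[c] if c < len(row) else None
            try:
                record[field] = parser(value)
            except (ValueError, TypeError) as exc:
                errors.append(CellError(f"{column_letter(c)}{r}", value, str(exc)))
                row_ok = False
        if row_ok:
            clean.append(record)
    return clean, errors
```

Collecting every error instead of stopping at the first one means the user gets a complete list of cells to fix in a single round trip.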
Upvotes: 2
Reputation: 96552
Welcome to the worst possible way to store data and try to import it into a database. If at all possible do not let them create garbage Excel spreadsheets like that. That method is bound to create very many bugs in the data imports and you will hate your life forever if you have to support this mess.
I can't believe I'm even suggesting this, but can you get them to use a simple Access database instead? It could even link directly to the SQL Server database and store the data correctly. By using Access forms, the users will find it relatively easy to add and maintain information, and you will have far fewer problems than trying to import Excel data in the form you described. It would be a far less expensive and far less error-prone solution to your problem.
If you are stuck with the format, the best way I have found to do something like this is to import it as is into a staging table, add the IDs to every subordinate row (you may end up looping to do this), then pull the information out into relational staging tables, and finally import it into the production database.
You can create all this using SSIS, but it won't be easy, it won't be quick, and it will be very prone to bugs if users aren't disciplined about exactly how they enter data (and they never are without a set of forms to fill out). Make sure you reject the Excel spreadsheet completely and send it back to the user if it strays at all from the prescribed structure. Trust me on this.
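To make the "add the IDs to every subordinate row" step concrete, here is a small Python sketch of the looping idea. It is not SSIS or T-SQL, just an illustration of the logic. It assumes each staged row carries a hypothetical `kind` marker (`"product"`, `"item"`, or `"task"`) plus a name; the function name and the generated ID scheme are made up for the example. Any row that appears out of order raises an error, so the whole file can be rejected.

```python
def assign_parent_ids(rows):
    """Walk flat (kind, name) rows top to bottom and attach parent IDs.

    Returns three lists of dicts (products, items, tasks) ready for
    relational staging tables. Raises ValueError on any structural problem.
    """
    products, items, tasks = [], [], []
    current_product = None  # ID of the most recent product row
    current_item = None     # ID of the most recent item under that product

    for line_no, (kind, name) in enumerate(rows, start=1):
        if kind == "product":
            current_product = len(products) + 1
            current_item = None  # a new product resets the item context
            products.append({"product_id": current_product, "name": name})
        elif kind == "item":
            if current_product is None:
                raise ValueError(f"row {line_no}: maintenance item before any product")
            current_item = len(items) + 1
            items.append({"item_id": current_item,
                          "product_id": current_product,
                          "name": name})
        elif kind == "task":
            if current_item is None:
                raise ValueError(f"row {line_no}: maintenance task without a maintenance item")
            tasks.append({"task_id": len(tasks) + 1,
                          "item_id": current_item,
                          "name": name})
        else:
            raise ValueError(f"row {line_no}: unknown row kind {kind!r}")

    return products, items, tasks


# The layout from the question, flattened to (kind, name) pairs.
example_rows = [
    ("product", "Product 1"),
    ("item", "Maintenance item 1"),
    ("task", "Maintenance task 1"),
    ("item", "Maintenance item 2"),
    ("task", "Maintenance task 2"),
    ("task", "Maintenance task 3"),
    ("product", "Product 2"),
]
products, items, tasks = assign_parent_ids(example_rows)
```

In the example, tasks 2 and 3 both end up pointing at maintenance item 2, and Product 2 is kept with no items, which matches the 0-many relationship in the question.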
I'd estimate the Access solution to take about a month and the Excel solution to take at least six months of development. Really, that's how bad this is going to be.
Upvotes: 7
Reputation: 57748
It probably makes more sense to break it up into several Excel sheets: one for products, another for maintenance items, and another for maintenance tasks. For each one, they'll have to enter some kind of ID to link them back together (e.g. maintenance_task_id=1 links to maintenance_item_id=4). That can be a pain for business users to remember, but the only alternative is to enter lots of redundant data on each line.
Next, create a normalized database model (to avoid storing redundant data) and fill it by writing an app or script to parse through your Excel sheets. Vague and high-level, but that's how I'd do it.
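A minimal sketch of what that normalized model and load step might look like, in Python. It uses the standard-library `sqlite3` module purely as a stand-in for SQL Server so the example runs anywhere. The table and column names are assumptions based on the IDs mentioned above, and the three sheets are assumed to have already been read into lists of tuples.

```python
import sqlite3

# Hypothetical normalized schema: one table per sheet, linked by IDs.
SCHEMA = """
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE maintenance_item (
    maintenance_item_id INTEGER PRIMARY KEY,
    product_id          INTEGER NOT NULL REFERENCES product(product_id),
    name                TEXT NOT NULL
);
CREATE TABLE maintenance_task (
    maintenance_task_id INTEGER PRIMARY KEY,
    maintenance_item_id INTEGER NOT NULL
        REFERENCES maintenance_item(maintenance_item_id),
    name                TEXT NOT NULL
);
"""


def open_db():
    """Create an in-memory database with foreign-key checks switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


def load_sheets(conn, products, items, tasks):
    """Insert all three sheets in one transaction.

    If any row references an ID that does not exist, sqlite3.IntegrityError
    is raised and nothing from the workbook is kept.
    """
    with conn:  # commits on success, rolls back on any exception
        conn.executemany("INSERT INTO product VALUES (?, ?)", products)
        conn.executemany("INSERT INTO maintenance_item VALUES (?, ?, ?)", items)
        conn.executemany("INSERT INTO maintenance_task VALUES (?, ?, ?)", tasks)


conn = open_db()
load_sheets(
    conn,
    products=[(1, "Product 1")],
    items=[(4, 1, "Maintenance item")],   # maintenance_item_id=4 -> product_id=1
    tasks=[(1, 4, "Maintenance task")],   # maintenance_task_id=1 -> maintenance_item_id=4
)
```

Letting the database enforce the links means a mistyped ID in any sheet fails the whole load rather than silently creating orphaned rows.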
Upvotes: 1