Reputation: 318
I am writing an internal application and one of the functions will be importing data from a remote system. The data from the remote system comes over as a CSV file. I need to compare the data in my system with that of the CSV file.
I need to apply any changes to my system (Adds and Changes). I need to track each field that is changed.
My database is normalized, so I'm dealing with about 10 tables that correspond to the data in the CSV file. What is the best way to implement this? Each CSV file has about 500,000 records that are processed daily. I started by querying my SQL database row by row using a lookup ID, then using C# to do a field-by-field compare and update or insert as necessary; however, this takes far too long.
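For illustration, the row-by-row approach described above looks roughly like this (a minimal sketch using Python's sqlite3 standard library and an invented single-table `customer` schema in place of the real C#/SQL Server code and the 10 normalized tables):

```python
import sqlite3

# Hypothetical schema: one target table keyed by lookup_id, with two data fields.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (lookup_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Alice', 'Boston'), (2, 'Bob', 'Denver')")

# Parsed CSV rows (lookup_id, name, city) -- in reality ~500,000 of these daily.
csv_rows = [(1, "Alice", "Austin"), (3, "Carol", "Miami")]

for lookup_id, name, city in csv_rows:
    # One round trip per record: this per-row lookup is what makes the approach
    # slow at 500,000 records.
    row = conn.execute(
        "SELECT name, city FROM customer WHERE lookup_id = ?", (lookup_id,)
    ).fetchone()
    if row is None:
        conn.execute("INSERT INTO customer VALUES (?, ?, ?)", (lookup_id, name, city))
    elif row != (name, city):
        conn.execute(
            "UPDATE customer SET name = ?, city = ? WHERE lookup_id = ?",
            (name, city, lookup_id),
        )

print(sorted(conn.execute("SELECT * FROM customer")))
# → [(1, 'Alice', 'Austin'), (2, 'Bob', 'Denver'), (3, 'Carol', 'Miami')]
```

The answers below all replace this per-record loop with set-based SQL against a staging table.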
Any suggestions?
Upvotes: 1
Views: 3608
Reputation: 92785
You can bulk-load the CSV file into a staging table and then drive the comparison with set-based SQL instead of processing record by record.
Using that approach you can implement almost all of the clean-up, validation, and update logic with your RDBMS's own functionality.
If your RDBMS is SQL Server, you can leverage SQL Server Integration Services (SSIS).
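SSIS packages are built in a designer rather than in code, but the underlying set-based pattern can be sketched with any RDBMS. A minimal illustration, assuming an invented `customer` table and using Python's sqlite3 standard library as a stand-in for SQL Server (where the bulk load would be `BULK INSERT` or `bcp`):

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (lookup_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Alice', 'Boston'), (2, 'Bob', 'Denver')")

# 1. Bulk-load the CSV into a staging table.
conn.execute("CREATE TABLE staging (lookup_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
csv_file = io.StringIO("lookup_id,name,city\n1,Alice,Austin\n3,Carol,Miami\n")
reader = csv.reader(csv_file)
next(reader)  # skip the header row
conn.executemany("INSERT INTO staging VALUES (?, ?, ?)", reader)

# 2. One set-based UPDATE for changed rows instead of 500,000 round trips.
conn.execute("""
    UPDATE customer SET
        name = (SELECT s.name FROM staging s WHERE s.lookup_id = customer.lookup_id),
        city = (SELECT s.city FROM staging s WHERE s.lookup_id = customer.lookup_id)
    WHERE lookup_id IN (SELECT lookup_id FROM staging)
""")

# 3. One set-based INSERT for new rows.
conn.execute("""
    INSERT INTO customer
    SELECT s.* FROM staging s
    WHERE s.lookup_id NOT IN (SELECT lookup_id FROM customer)
""")

print(sorted(conn.execute("SELECT * FROM customer")))
# → [(1, 'Alice', 'Austin'), (2, 'Bob', 'Denver'), (3, 'Carol', 'Miami')]
```

In SQL Server the two statements could also be combined into a single `MERGE`; with the normalized 10-table schema you would repeat the pattern per target table, joining the staging table on each table's key.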
Upvotes: 3
Reputation: 150108
If you have anything that serves as a unique key, you can bulk-load the CSV into a staging table and then drive set-based inserts and updates off a join on that key.
Upvotes: 2
Reputation: 2473
Expanding on the first comment to your question.
Create an appropriately indexed table that matches the format of your CSV file and bulk-load the data straight into it.
Have a stored procedure with appropriate queries to update/delete/insert to the active tables.
Get rid of the temporary table.
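The three steps above can be sketched end to end. This is a minimal illustration using Python's sqlite3 standard library with an invented `product`/`sku` schema; SQLite has no stored procedures, so the procedure body from step two runs here as plain statements:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (sku TEXT PRIMARY KEY, price REAL)")
conn.execute("INSERT INTO product VALUES ('A', 9.99), ('B', 5.00), ('C', 2.50)")

# 1. Staging table matching the CSV layout, indexed on the lookup key.
conn.execute("CREATE TABLE csv_staging (sku TEXT, price REAL)")
conn.execute("CREATE INDEX ix_staging_sku ON csv_staging (sku)")
conn.executemany("INSERT INTO csv_staging VALUES (?, ?)",
                 [("A", 10.99), ("D", 7.25)])

# 2. The stored procedure's body: set-based update, delete, and insert.
#    (The question only needs adds and changes; drop the DELETE if rows
#    missing from the CSV should be kept.)
conn.executescript("""
    UPDATE product SET price =
        (SELECT s.price FROM csv_staging s WHERE s.sku = product.sku)
    WHERE sku IN (SELECT sku FROM csv_staging);

    DELETE FROM product WHERE sku NOT IN (SELECT sku FROM csv_staging);

    INSERT INTO product
    SELECT s.sku, s.price FROM csv_staging s
    WHERE s.sku NOT IN (SELECT sku FROM product);
""")

# 3. Get rid of the staging table once the active tables are in sync.
conn.execute("DROP TABLE csv_staging")
print(sorted(conn.execute("SELECT * FROM product")))
# → [('A', 10.99), ('D', 7.25)]
```

In SQL Server the staging load would be `BULK INSERT`/`bcp`/SSIS and step two a real stored procedure, but the shape of the queries is the same.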
Upvotes: 0