Reputation: 1667
I have folders where approx 3000 new csv files come in on a daily basis, each containing between 50 and 2000 lines of information.
Currently, there is a process in place which picks these files up one at a time and takes each line one at a time and sends it to a stored procedure to insert the contents into a database.
This means that over the course of a day, it can struggle to get through the 3000 files before the next 3000 come in!
I'm looking to improve this process and had the following ideas
Any other ideas on how I could look at doing this? Currently it can take up to 20 seconds per file, I'd really like to improve performance on this considerably.
Upvotes: 2
Views: 2356
Reputation: 5421
Let's say that all 3000 files to be imported have 2000 rows each. That's 6 million rows per day. The bottleneck might not be at the client doing the inserts, but with the database itself. If indexes are enabled on the table(s) in question, inserts could be slow, depending upon how heavily indexed the table(s) is/are. What indications have led you to conclude that it is the database which is waiting around for something to do and that it is the import routine that is lagging behind, rather than the other way around?
Upvotes: 1
Reputation: 95761
You said
Currently, there is a process in place which picks these files up one at a time and takes each line one at a time and sends it to a stored procedure to insert the contents into a database.
(Emphasis added.)
That seems to mean one line equals one transaction.
Fix that.
I guess both of those sound like "replace your stored procedure". But the real point is to reduce the number of transactions. Either of those options would reduce the number of transactions for this process from 6 million a day (worst case) to 3000 a day.
Upvotes: 0
Reputation: 9338
You can use SQL Server native BCP utility.
More info about BCP utility can be found here: Importing and Exporting Bulk Data by Using the bcp Utility
You can also take a look at: About Bulk Import and Bulk Export Operations
Upvotes: 1
Reputation: 24388
SQL Server Bulk Insert might be just what you need
http://msdn.microsoft.com/en-us/library/ms188365.aspx
Another issue you may be seeing with all of those inserts taking a long time is every time a row is added, your table may be getting reindexed. A search like this will give lots of good articles on ways to maybe get better performance out of your current procedure http://www.google.com/search?q=sql+insert+performance
Upvotes: 3