Reputation: 1162
I'm doing some work that involves inserting a batch of records into a Sql database. The size of the batch will vary but for arguments sake we can say 5000 records every 5 secs. It is likely to be less though. Multiple processes will be writing to this table, nothing is reading from it.
What I have noticed during a quick test is that using a SqlTransaction around this whole batch insert seems to improve performance.
e.g.
SqlTransaction trans = Connection.BeginTransaction()
myStoredProc.Transaction = trans;
sampleData.ForEach(ExecuteNonQueryAgainstDB);
transaction.Commit();
I'm not interested in having the ability to rollback my changes so I wouldn't have really considered using a transaction except it seems to improve performance. If I remove this Transaction code my inserts go from taking 300ms to around 800ms!
What is the logic for this? Because my understanding is the transaction still writes the data to the DB but locks the records until it is committed. I would have expected this to have an overhead...
What I am looking for is the fastest way to do this insert.
Upvotes: 4
Views: 6668
Reputation: 34238
I've just finished writing a blog post on the performance gains you can get by explicitly specifying where transactions start and finish.
With Dapper i have observed transactions cutting batch insert down to 1/2 the original time and batch update times down to 1/3 of the original time
Upvotes: 1
Reputation: 52123
What you're getting is perfectly normal.
If your working with a usual isolation level (let's say commited or snapshot) then when you don't use transactions the database engine has to check for conflicts every time you make an insert. That is, it has to make sure that whenever someone reads from that table (with a SELECT *
) for example, it doesn't get dirty reads, that is, mantain the insertion so that while the insertion itself it's taking place noone else is reading.
That will mean, lock, insert row, unlock, lock, insert row, unlock and so on.
When you encapsulate all that in a transaction what you're effectively achieving is reducing that series of "lock" and "unlock" into just one in the commit phase.
Upvotes: 3
Reputation: 174299
The commit is what costs time. Without your explicit transaction, you have one transaction per query executed. With the explicit transaction, no additional transaction is created for your queries. So, you have one transaction vs. multiple transactions. That's where the performance improvement comes from.
Upvotes: 9
Reputation: 166376
If you are looking for a fast wqay to insert/load data have a look at SqlBulkCopy Class
Upvotes: 5