Reputation: 463
I am trying to do a bulk insert from an XML file (parsed and analyzed). I am using Entity Framework 6.1.3 in VB.net.
The processing of the XML file (not saving) took about 6.5 minutes to run. By Googling I found
ctx.Configuration.AutoDetectChangesEnabled = False
which brought the processing time down to 15 seconds! Hooray!
The problem is, when I get to ctx.SaveChanges()
it takes 20+ minutes to write the data to the database! It is just flooding the SQL server with single Insert statements. In an effort to make this more efficient, I want to do a BulkInsert. I came across a couple different options including SQLBulkCopy and what I am currently using EFUtilities. My code is as follows:
Using ctx As New MyContext()
For Each dp As XElement In de.Elements
Party = New DistinctParty()
With Party
.Comment = dp.<d:Comment>.Value
.ListID = ListType
End With
ctx.DistinctParties.Add(Party)
AddAliases(dp.<d:Profile>.<d:Identity>.FirstOrDefault(), ctx)
AddFeatures(dp.<d:Profile>.<d:Feature>, ctx)
Next
' Save the items
Dim DPs As List(Of Integer)
DPs = ctx.DistinctParties.Where(Function(x) x.ListID = ListType).Select(Function(x) x.DistinctPartyID).ToList
EFBatchOperation.For(ctx, ctx.DistinctParties).InsertAll(ctx.DistinctParties.Where(Function(x) x.ListID = ListType))
EFBatchOperation.For(ctx, ctx.DistinctPartyAliases).InsertAll(ctx.DistinctPartyAliases.Where(Function(x) DPs.Contains(x.DistinctPartyID)))
EFBatchOperation.For(ctx, ctx.DistinctPartyFeatures).InsertAll(ctx.DistinctPartyFeatures.Where(Function(x) DPs.Contains(x.DistinctPartyID)))
'ctx.SaveChanges()
End Using
ctx.SaveChanges()
works fine with this. It just takes 20 minutes.
Problem is, DPs.Count
is 0! If I had old record in the database before I delete them earlier in the code, DPs is the count of the data that WAS in the database.
QUESTION: Why is DPs.Count = 0 when there is really data in the context for the 3 different tables?
RELATED QUESTION: Is this the correct way to do the bulk insert?
Upvotes: 1
Views: 167
Reputation: 11347
QUESTION: Why is DPs.Count = 0 when there is really data in the context for the 3 different tables?
When you use LINQ Queryable, the results (count, first, etc.) are taken from what's currently in the database and doesn't care about what's in context/change tracker which is similar to a list of "pending" entities waiting to be committed/saved to the database.
So this is why the count equal 0 if the database has 0 records even if you are adding thousands of entities in the context (which have not been saved yet)
RELATED QUESTION: Is this the correct way to do the bulk insert?
There are three main libraries supporting Bulk Insert, you can read my Bulk Insert Library Reviews & Comparisons
In short, I recommend Entity Framework Extensions over other libraries since it's the only one really supported and has way more flexibility than other libraries: Bulk Insert, Update, Delete, Merge and BulkSaveChanges.
Disclaimer: I'm the owner of the project Entity Framework Extensions
Upvotes: 2