Reputation: 30855
(A) This version is slow... duration is measured in multiple minutes.

AA_Words_100 is a simple SQL Server table which is added to the EF designer. DB.AA_Words_100.Add is called ~3,000 times (confirmed via debugging with counter variables). Words100 is a Dictionary<string, Dictionary<string, _Word>>, where _Word is a trivial non-EF object.

foreach (var XCampaign in Words100)
    foreach (var KVP in XCampaign.Value)
        DB.AA_Words_100.Add(KVP.Value.To_AA_Word_100());
DB.SaveChanges();
(B) This version is fast... .Add() is simply commented out to narrow the scope.

var iTemp = 0;
foreach (var XCampaign in Words100)
    foreach (var KVP in XCampaign.Value)
        iTemp++;
DB.SaveChanges();
(C) This version is fast. I simply fill up a List before calling DB.AA_Words_100.AddRange(...).

var LIST_WordsToAdd = new List<AA_Words_100>();
foreach (var XCampaign in Words100)
{
    foreach (var KVP in XCampaign.Value)
    {
        LIST_WordsToAdd.Add(KVP.Value.To_AA_Word_100());
    }
}
DB.AA_Words_100.AddRange(LIST_WordsToAdd);
DB.SaveChanges();
(D) Documentation

According to the DbContext.Add documentation:

"Begins tracking the given entity, and any other reachable entities that are not already being tracked, in the Added state such that they will be inserted into the database when SaveChanges() is called."

In particular, note the phrase "when SaveChanges() is called."
I recently migrated to EF from Linq-to-SQL in this application. Linq-to-SQL did not have this problem.
What reason could there be for the DB.AA_Words_100.Add(...) command being so slow?
Thank you!
Update - To_AA_Word_100() Code
public AA_Words_100 To_AA_Word_100()
{
var W = new AA_Words_100();
W.Word = Word;
W.Word_Clean = Word.Replace("'", "");
W.PhraseCount = PhraseCount;
W.Clicks = Clicks;
W.Impressions = Impressions;
W.SalesAmt = SalesAmt;
W.SalesOrders = SalesOrders;
W.SalesUnits = SalesUnits;
W.Spend = Spend;
W.Campaign = Campaign;
return W;
}
Upvotes: 2
Views: 760
Reputation: 101483
As stated in the comments, Entity Framework (not sure about Entity Framework Core) by default calls DetectChanges on every Add. This function, among other things, scans all entities already tracked by the context to detect changes in and between them. That means the time complexity of this function is O(n), where n is the number of entities already tracked by the context. When you do a lot of adds in a loop, the total time complexity becomes O(n^2), where n is the total number of items added. So even with tiny numbers like 3,000 rows, performance drops very significantly.
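To see the effect yourself, here is a minimal sketch (MyDbContext is a hypothetical stand-in for the question's DB context) that times Add calls in 500-row batches; with automatic change detection on, each batch should take noticeably longer than the one before it:

using System;
using System.Diagnostics;

using (var db = new MyDbContext())   // hypothetical context exposing an AA_Words_100 DbSet
{
    var sw = Stopwatch.StartNew();
    for (int i = 1; i <= 3000; i++)
    {
        // Each Add triggers DetectChanges, which scans every entity
        // tracked so far, so the per-Add cost keeps growing.
        db.AA_Words_100.Add(new AA_Words_100 { Word = "w" + i });
        if (i % 500 == 0)
        {
            Console.WriteLine("rows " + (i - 499) + "-" + i + ": " + sw.ElapsedMilliseconds + " ms");
            sw.Restart();
        }
    }
}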
To fix this (arguably questionable) design issue, there are a couple of options:

- Set AutoDetectChangesEnabled of the context to false, then manually call DetectChanges before SaveChanges (see the sketch after this list).
- Or use AddRange instead of adding entities one by one; it calls DetectChanges just once.
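Both options as a sketch against the question's model (MyDbContext is assumed; Words100 and To_AA_Word_100 are from the question; Configuration.AutoDetectChangesEnabled, ChangeTracker.DetectChanges and AddRange are the EF6 APIs):

using System.Collections.Generic;

// Option 1: disable automatic change detection, detect once at the end.
using (var db = new MyDbContext())
{
    db.Configuration.AutoDetectChangesEnabled = false;
    foreach (var XCampaign in Words100)
        foreach (var KVP in XCampaign.Value)
            db.AA_Words_100.Add(KVP.Value.To_AA_Word_100());
    db.ChangeTracker.DetectChanges();   // one O(n) scan instead of ~3,000
    db.SaveChanges();
}

// Option 2: collect the entities first, then AddRange,
// which runs DetectChanges only once for the whole batch.
using (var db = new MyDbContext())
{
    var toAdd = new List<AA_Words_100>();
    foreach (var XCampaign in Words100)
        foreach (var KVP in XCampaign.Value)
            toAdd.Add(KVP.Value.To_AA_Word_100());
    db.AA_Words_100.AddRange(toAdd);
    db.SaveChanges();
}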
A few other notes:
Try to avoid reusing a context between operations. You said there were already 3,000 entities tracked by the context before you called the first Add. It's better to create a new context every time you need it, do the work, then dispose it. The performance impact is negligible (connections are managed by the connection pool and are not necessarily opened or closed every time you create/dispose a context), but you will have far fewer problems like this one (reusing a context can bite you not only in the scenario you have now, but in several others).
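For example (sketch; MyDbContext again stands in for the question's context type):

using (var db = new MyDbContext())   // one short-lived context per unit of work
{
    // ... query or save ...
    db.SaveChanges();
}   // disposed here, so the set of tracked entities never grows across operations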
Use AsNoTracking queries if you do not intend to modify the entities returned by a specific query (or if you intend to modify some of them later by attaching them to a context). The context will then not track them, which reduces the likelihood of this and other performance problems.
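For example (sketch; the Where filter is just an illustration):

using System.Linq;

using (var db = new MyDbContext())
{
    // Entities are materialized but not tracked by the context,
    // so subsequent DetectChanges scans have less work to do.
    var words = db.AA_Words_100
                  .AsNoTracking()
                  .Where(w => w.Clicks > 0)
                  .ToList();
}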
As for Linq-to-SQL: it has a similar "detect changes" concept, but it is invoked automatically only before committing changes to the database (in SubmitChanges), not on every add, so you do not see the same problem there.
Upvotes: 4