Reputation: 12037
There's a lot of information and answers on here and elsewhere about how to speed up bulk inserts with EF.
However, I'm only inserting about 10,000 records and the slow part of creating the entities.
First, I save the external data to a datatable, then loop through and for each row create a new entity
, add the child entities in the loop (sourced from a couple of other datatables) and once the loop is finished call db.SaveChanges()
.
In my initial profiling, the db.SaveChanges()
is slow, but nothing compared to the loop of creating all the objects in the first place - either as a separate List<entity> entities
or directly to the context.
Is this usual? As of all the issues with bulk insert I can see, most seem to be related to the final commit to the database.
Edit to add some code Please excuse the psudo-code
DataTable ref1 = ConvertCSVtoDataTable(csv, firstRowsToDelete: 15); // Return's a Datatable from a CSV
foreach(string file in ListOfFilesToProcess)
{
DataTable tbl = loadExcelFiles(file);
foreach(DataRow dr in tbl.Rows)
{
Entity newEntity = new Entity();
Entity.property1 = dr["Property1"].ToString();
... // Keep mapping properties to elements in the datatable
Entity.Child.Add(new ChildEntity() { prop1 = ref1["ChildProp1"].ToString() });
// Add the newly created entity to the context
db.Entity.Add(newEntity);
}
}
// Save the context
db.SaveChanges();
So by the point of saving the context there's a few thousand newEntity
objects and their child navigation
objects.
Iterating over the datatable and creating all these objects is the part which is the slowest!
Also, db.Configuration.AutoDetectChangesEnabled = false;
has already been set.
Upvotes: 1
Views: 7814
Reputation: 11337
The slow part in your “creating all the objects” section is the Detect Changes.
You should always use AddRange over Add
This code should fix the slow part when creating objects:
DataTable ref1 = ConvertCSVtoDataTable(csv, firstRowsToDelete: 15); // Return's a Datatable from a CSV
List<Entity> list = new List<Entity>();
foreach(string file in ListOfFilesToProcess)
{
DataTable tbl = loadExcelFiles(file);
foreach(DataRow dr in tbl.Rows)
{
Entity newEntity = new Entity();
Entity.property1 = dr["Property1"].ToString();
... // Keep mapping properties to elements in the datatable
Entity.Child.Add(new ChildEntity() { prop1 = ref1["ChildProp1"].ToString() });
list.Add(newEntity);
}
}
// Add all newly created entities to the context
db.Entity.AddRange(list);
// Save the context
db.SaveChanges();
If after this fix, you still have some issue performance (from database this time), you should try a BulkSaveChanges / Bulk Insert third party library.
Here is an article about these libraries: Entity Framework - Bulk Insert Library Reviews & Comparisons.
Disclaimer: I'm the owner of the project Entity Framework Extensions
Upvotes: 1
Reputation: 70307
Just use a bulk insert. Even if you get past EF's ridiculously bad performance, you still have to contend with the database not liking individual inserts.
Upvotes: 1