Muds

Reputation: 4116

XML Serializer fails processing large XML data

I know this must be straightforward, but somehow I am not able to get my head around it.

-- Scenario --

We are using a system with a WPF front end, a database-first Entity Framework layer, and a SQL Server 2008 R2 database.

Most of our entities are parent-child relations. A parent cannot exist without children, so our stored procedures take the parent data plus a list of child data, where the list of child data is passed as XML after serializing the entire list with the XML serializer.

All this was working fine until we had to save 500,000 child records; the XML serializer then fails with an out-of-memory error. I then tried sending the child records in batches of 10,000, which executes the query and saves the data, but atomicity is lost because it is just a for loop like this:

 For Each batchList In groups.Split(10000)
     Dim serializer As New XmlListSerializer(Of someObject)(batchList.ToList(), "Data")
     Dim xml = serializer.ToXmlString
     efContext.AddData(CType(addedParentId.Value, Short?), xml, "/Data/someObject")
 Next

Where AddData is a function import mapped to a stored procedure.

-- Question --

Now my question is: what is the right way to add multiple child records to the database while preserving the atomicity of the operation?

Also, the above-mentioned way is dead slow: for half a million records it takes 5 minutes, which is not acceptable anyway.

Upvotes: 0

Views: 114

Answers (1)

Scott Chamberlain

Reputation: 127543

what is the right way to add multiple child records to the database while preserving the atomicity of the operation?

The way you do this is by starting a transaction before your multiple steps, then committing or rolling back at the end. In fact, Entity Framework already does this when it performs multiple inserts in one SaveChanges() call. Which brings us to:
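Applied to the batch loop in the question, one way to get atomicity back is to wrap all the batches in a single ambient TransactionScope. This is only a sketch: groups, XmlListSerializer, someObject, addedParentId, and efContext.AddData are taken from the question's code, not defined here.

```vbnet
' Sketch only: wraps the question's batch loop in one ambient transaction.
' Requires: Imports System.Transactions
Using scope As New TransactionScope(TransactionScopeOption.Required,
                                    New TransactionOptions With {.Timeout = TimeSpan.FromMinutes(10)})
    For Each batchList In groups.Split(10000)
        Dim serializer As New XmlListSerializer(Of someObject)(batchList.ToList(), "Data")
        Dim xml = serializer.ToXmlString
        efContext.AddData(CType(addedParentId.Value, Short?), xml, "/Data/someObject")
    Next
    scope.Complete() ' commit; if an exception escapes before this line, everything rolls back
End Using
```

Be aware that if the context opens and closes its connection on each call, the scope can escalate to a distributed (MSDTC) transaction on SQL Server 2008 R2; keeping a single connection open for the whole loop avoids that.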

Also, the above-mentioned way is dead slow: for half a million records it takes 5 minutes, which is not acceptable anyway.

Entity Framework is NOT designed for bulk operations: each of those 500,000 records will generate a separate INSERT INTO statement. Either use an extension for EF that provides bulk-insert abilities, or use old-fashioned ADO.NET with SqlBulkCopy to load the data.
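A minimal SqlBulkCopy sketch, assuming the child rows have already been put into a DataTable and that the destination table is named ChildData (connectionString, childDataTable, and ChildData are placeholders, not names from the question):

```vbnet
' Sketch: bulk-load all child rows in one explicit transaction.
' Requires: Imports System.Data.SqlClient
Using conn As New SqlConnection(connectionString)
    conn.Open()
    Using tran = conn.BeginTransaction()
        Using bulk As New SqlBulkCopy(conn, SqlBulkCopyOptions.Default, tran)
            bulk.DestinationTableName = "ChildData"
            bulk.BatchSize = 10000            ' rows sent per round trip to the server
            bulk.BulkCopyTimeout = 600        ' seconds; default of 30 is too short for 500k rows
            bulk.WriteToServer(childDataTable)
        End Using
        tran.Commit() ' atomic: no rows are visible to other sessions until this succeeds
    End Using
End Using
```

Because every batch runs on the same SqlTransaction, a failure anywhere rolls back all 500,000 rows, and the load itself typically finishes in seconds rather than minutes.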


As for the XML issue, if your child lists are that big, the "correct" thing to do is not to use XML at all. Parent and child should be separate tables in the database, and you would then have a third "linking" table such as:

create table Parent_Child_Links
(
    ParentId int,
    ChildId int
)

When you have a many-to-many relationship and you let Entity Framework create your tables for you, it generates that third linking table with foreign keys pointing at the two other tables.

Upvotes: 1
