NDraskovic

Reputation: 706

Copying data from XML into a new file

I have an XML file that contains over 50,000 records (future files might have up to 500,000). Each record has three levels: a main level (used to distinguish records), a common data level (tags whose attributes define each record), and a third level that contains the data specific to each record (mostly as attributes, but sometimes as inner text). My task is to "dissect" this file into multiple smaller files. An attribute on the third level determines which group the whole record belongs to.

The algorithm should go like this:

For each record in the file:

So my question is: what is the easiest (and most efficient) way to copy data into a new file? Keep in mind that I need to copy the entire record, not just some specific data. I'm working in C# using VS 2010.

Upvotes: 0

Views: 578

Answers (3)

Pablo Romeo

Reputation: 11396

The most efficient way (in terms of performance) would be to have a single XmlReader instance going through your large file. Since a record can go to any of several groups, you should have multiple XmlWriter instances, created on demand and stored in a dictionary keyed by "group key" so that later iterations can reuse them.

Using XmlReader and XmlWriter you avoid loading the entire file in memory.

To keep track of the nested levels you go through you could use a Stack, pushing the items as you navigate inwards and popping as you navigate outwards, or just local variables in your method.

Don't forget to close your Stream instances when you are done.
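A minimal sketch of this approach, under several assumptions that are not in the question: each record is an element named `record`, the group lives in an attribute named `group` somewhere below it, and each output file wraps its records in a `records` root element. To keep the code short it swaps in `XNode.ReadFrom` to materialize one record at a time instead of tracking nesting levels by hand, so only a single record is in memory at any moment.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class RecordSplitter
{
    // "record", "group", "records" and "ungrouped" are hypothetical names.
    public static void Split(string inputPath, string outputDir)
    {
        var writers = new Dictionary<string, XmlWriter>();
        var settings = new XmlWriterSettings { Indent = true };
        try
        {
            using (XmlReader reader = XmlReader.Create(inputPath))
            {
                reader.MoveToContent();
                while (!reader.EOF)
                {
                    if (reader.NodeType == XmlNodeType.Element && reader.Name == "record")
                    {
                        // ReadFrom consumes the whole record and leaves the
                        // reader on the node that follows it, so no Read() here.
                        XElement record = (XElement)XNode.ReadFrom(reader);
                        XAttribute group = record.Descendants().Attributes("group").FirstOrDefault();
                        string key = group != null ? group.Value : "ungrouped";

                        XmlWriter writer;
                        if (!writers.TryGetValue(key, out writer))
                        {
                            // Assumes the group value is a valid file name.
                            writer = XmlWriter.Create(Path.Combine(outputDir, key + ".xml"), settings);
                            writer.WriteStartElement("records");
                            writers.Add(key, writer);
                        }
                        record.WriteTo(writer);
                    }
                    else
                    {
                        reader.Read();
                    }
                }
            }
        }
        finally
        {
            // Close every output file, even if reading failed part-way.
            foreach (XmlWriter open in writers.Values)
            {
                open.WriteEndElement();
                open.Close();
            }
        }
    }
}
```

Calling `RecordSplitter.Split("input.xml", "out")` would then produce one file per distinct group value in an existing `out` directory. If the number of groups is very large, keeping one open writer per group may hit file-handle limits, in which case appending per record or batching would be a safer design.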

Upvotes: 1

Saroop Trivedi

Reputation: 2265

You can do this with System.Xml. Create a List<XmlElement> and walk through the three levels of each record.

    // Requires: using System.Xml;
    XmlDocument doc = new XmlDocument();
    doc.Load("Test.xml");                    // loads the whole file into memory
    XmlElement root = doc.DocumentElement;
    // Perform your read and write operations on root here
    doc.Save("Test.xml");                    // note: overwrites the file that was loaded

Upvotes: 0

Darin Dimitrov

Reputation: 1038930

You could use an XmlReader to progress through the nodes of the source file and, once you encounter a node that meets your requirements, simply read it and copy it to a new file (the reader's ReadOuterXml method returns the node's entire string representation, including its children, which you could then store in a new file).
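As a rough illustration of reading a whole node as a string, the sketch below yields the outer markup of each record via `XmlReader.ReadOuterXml`. The element name `record` is an assumption, and deciding which records "meet your requirements" is left to the caller.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class RecordReader
{
    // Lazily yields the full markup of every <record> element
    // ("record" is a hypothetical element name).
    public static IEnumerable<string> ReadRecords(string inputPath)
    {
        using (XmlReader reader = XmlReader.Create(inputPath))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "record")
                {
                    // ReadOuterXml returns the element with its children and
                    // positions the reader after the end tag.
                    yield return reader.ReadOuterXml();
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

Each returned string could then be filtered and appended to the target file. Because the sequence is lazy, only one record's markup is held in memory at a time.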

By the way, if you expect your XML to grow to millions of records, I would recommend anticipating this growth and switching to a database, which is better suited to handling such volumes of data.

Upvotes: 1
