Reputation: 58
I have a large XML file with the following structure.
<tree>
<limb>
<DATA0>
</limb>
<limb>
<DATA1>
</limb>
<limb>
<DATA2>
</limb>
</tree>
There are several thousand limb elements, each with child elements. I need to parse this file, extract the limb elements in batches of 100 to 200, and create a new XML file from each batch.
Is there a preferred method for performing this operation? I only know C# at a novice/intermediate level, and have worked with XML files for a while.
I am considering writing a loop that counts the total number of limb elements, then calculating how many new XML documents I will need (5000 limb elements / batches of 200 == 25 XML documents). From there I would read the first 200 limb elements, copy them into a new file, save it, and repeat until the end of the file.
Does my logic seem flawed?
Upvotes: 1
Views: 3870
Reputation: 163322
There might be an excuse to write this in C# if you were an expert in C# and didn't have time to learn anything else, but since that isn't the case, XSLT is a much better tool for the job, especially XSLT 2.0, since that can produce multiple output files. (There are two XSLT 2.0 processors you can use in a C# environment: Saxon and XQSharp.) It looks like a very simple job in XSLT, something like:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:template match="/">
<xsl:for-each-group select="//limb" group-adjacent="(position()-1) idiv 200">
<xsl:result-document href="batch{position()}.xml">
<batch>
<xsl:copy-of select="current-group()"/>
</batch>
</xsl:result-document>
</xsl:for-each-group>
</xsl:template>
</xsl:stylesheet>
Upvotes: 0
Reputation: 11955
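LINQ to XML, as Robert linked, would look like:
// file is the path to the source XML document
XElement xfile = XElement.Load(file);
// The direct <limb> children of the <tree> root
var limbs = xfile.Elements("limb");
int count = limbs.Count();
// Skip/Take select each batch; repeat with Skip(400), Skip(600), ... up to count
var first200 = limbs.Take(200);
var next200 = limbs.Skip(200).Take(200);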
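To turn the Skip/Take idea above into the full loop the question describes, something like the following might work. It is only a sketch: the class name LimbBatcher, the method names, and the output naming scheme (prefix plus batch number) are made up for illustration, and it assumes the whole file fits comfortably in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class LimbBatcher
{
    // Splits the <limb> children of the root into documents
    // holding at most batchSize limbs each.
    public static List<XDocument> Split(XElement tree, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        // Materialise once so Skip/Take do not re-walk the tree on every pass.
        List<XElement> limbs = tree.Elements("limb").ToList();

        var batches = new List<XDocument>();
        for (int start = 0; start < limbs.Count; start += batchSize)
        {
            // Elements that already have a parent are cloned when added to
            // a new XElement, so the source tree is left untouched.
            var root = new XElement(tree.Name, limbs.Skip(start).Take(batchSize));
            batches.Add(new XDocument(root));
        }
        return batches;
    }

    // Loads inputPath and writes outputPrefix1.xml, outputPrefix2.xml, ...
    public static void SplitFile(string inputPath, string outputPrefix, int batchSize)
    {
        List<XDocument> batches = Split(XElement.Load(inputPath), batchSize);
        for (int i = 0; i < batches.Count; i++)
        {
            batches[i].Save($"{outputPrefix}{i + 1}.xml");
        }
    }
}
```

Note that there is no need to work out the number of output documents up front; the loop simply stops when it runs out of limb elements, and the last file holds whatever is left over.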
Upvotes: 2
Reputation: 5895
If the document is too large to load into memory, you can use XmlReader. You get a reader from XmlReader.Create and read forward through the file one node at a time, so only the current element has to be held in memory. Unless the file is greater than, say, 10-20% of the size of your RAM, or you need it to be fast, it probably isn't worth the extra effort, though.
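A rough sketch of the streaming approach, in case it helps. The class name StreamingLimbSplitter and the output naming scheme are invented for illustration, and it assumes the output root should be called tree like the input. Only one limb element is in memory at any time.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class StreamingLimbSplitter
{
    // Writes outputPrefix1.xml, outputPrefix2.xml, ... and returns the file count.
    public static int Split(string inputPath, string outputPrefix, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        int fileCount = 0;
        int inBatch = 0;
        XmlWriter writer = null;
        var settings = new XmlWriterSettings { Indent = true };
        try
        {
            using (XmlReader reader = XmlReader.Create(inputPath))
            {
                reader.MoveToContent();
                while (!reader.EOF)
                {
                    if (reader.NodeType == XmlNodeType.Element && reader.Name == "limb")
                    {
                        if (writer == null)
                        {
                            // Start a new output file for this batch.
                            fileCount++;
                            writer = XmlWriter.Create($"{outputPrefix}{fileCount}.xml", settings);
                            writer.WriteStartElement("tree");
                        }

                        // ReadFrom consumes the whole <limb> subtree and leaves the
                        // reader on the node after it, so no Read() call here.
                        XNode.ReadFrom(reader).WriteTo(writer);

                        if (++inBatch == batchSize)
                        {
                            // Batch is full: close the root element and the file.
                            writer.WriteEndElement();
                            writer.Dispose();
                            writer = null;
                            inBatch = 0;
                        }
                    }
                    else
                    {
                        reader.Read();
                    }
                }
            }
            // Close the root of a final, partially filled batch.
            if (writer != null) writer.WriteEndElement();
        }
        finally
        {
            writer?.Dispose();
        }
        return fileCount;
    }
}
```

The one thing to watch is mixing XNode.ReadFrom with reader.Read(): calling Read() after ReadFrom would skip a node, which is why the loop only advances manually on nodes that are not limb elements.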
Upvotes: 2