Is there a best practice for getting the number of elements in an XML document for progress reporting purposes? I have a 2 GB XML file containing flights which I need to process, and my idea is to first get the number of all elements in the file and then use a counter to show that x of y flights have been imported into our database.
For the file processing we are using the XmlTextReader in .NET (C#) to get the data without reading the whole document into memory (similar to SAX parsing).
So the question is, how can I get the number of those elements very quickly? Is there a best practice, or should I go through the whole document first and do something like i++;?
Thanks!
Upvotes: 6
Views: 6348
Reputation: 16708
int count = 0;
using (XmlReader xmlReader = new XmlTextReader(new StringReader(text)))
{
    while (xmlReader.Read())
    {
        // Count only start tags named "Flight".
        if (xmlReader.NodeType == XmlNodeType.Element &&
            xmlReader.Name.Equals("Flight"))
        {
            count++;
        }
    }
}
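The snippet above reads from a string, which is impractical for a 2 GB file. A minimal sketch of the same counting loop reading straight from disk might look like the following; the class name FlightCounter, the method CountElements, and its path parameter are hypothetical names chosen for illustration, not part of the original answer.

```csharp
using System;
using System.Xml;

static class FlightCounter
{
    // Streams through the file once and counts start tags with the given name.
    // Only the reader's internal buffer is held in memory, not the whole document.
    public static int CountElements(string path, string elementName)
    {
        int count = 0;
        using (XmlReader reader = XmlReader.Create(path))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element &&
                    reader.Name == elementName)
                {
                    count++;
                }
            }
        }
        return count;
    }
}
```

Calling FlightCounter.CountElements(path, "Flight") before the import would give the total for an "x of y" display, at the cost of a full extra pass over the file.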
Upvotes: 1
Reputation: 1502116
You certainly can just read the document twice: once simply to count the elements (keep calling XmlReader.ReadToFollowing, for example, or possibly ReadToNextSibling), increasing a counter as you go:
int count = 0;
while (reader.ReadToFollowing(name))
{
    count++;
}
However, that does mean reading the file twice...
An alternative is to find the length of the file and, as you read through the file once, report the percentage of the file processed so far, based on the position of the underlying stream. This will be less accurate, but far more efficient. You'll need to create the XmlReader directly from a Stream so that you can keep checking the position, though.
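A rough sketch of that single-pass approach, assuming the file path is passed as the first command-line argument and the elements are named "Flight" as in the question (the class name and the console output are illustrative only):

```csharp
using System;
using System.IO;
using System.Xml;

class FlightImportProgress
{
    static void Main(string[] args)
    {
        string path = args[0]; // assumption: path to the XML file

        using (FileStream stream = File.OpenRead(path))
        using (XmlReader reader = XmlReader.Create(stream))
        {
            long length = stream.Length;
            int lastPercent = -1;

            while (reader.ReadToFollowing("Flight"))
            {
                // ... import the current flight here ...

                // The reader buffers ahead, so the stream position is only
                // an approximation of how far parsing has actually got.
                int percent = length == 0
                    ? 100
                    : (int)(stream.Position * 100 / length);

                if (percent != lastPercent)
                {
                    Console.WriteLine("{0}% processed", percent);
                    lastPercent = percent;
                }
            }
        }
    }
}
```

Because the percentage comes from bytes consumed rather than elements seen, it can jump in steps the size of the reader's buffer, but it avoids the second pass entirely.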
Upvotes: 7