Reputation: 1631
I need to count the number of nodes that are placed directly under the root element in an XML stream. I don't care about any of the subnodes.
For example, for the following XML it should return 4:
<?xml version="1.0" encoding="utf-8"?>
<root>
  <node1>
    <subnode1_1>
      <subnode_1_1_1>
        <subnode_1_1_1_1>…</subnode_1_1_1_1>
      </subnode_1_1_1>
      <subnode_1_1_2>…</subnode_1_1_2>
    </subnode1_1>
  </node1>
  <node2 />
  <node3>
    <subnode3_1>…</subnode3_1>
    <subnode3_2>…</subnode3_2>
    <subnode3_3>…</subnode3_3>
  </node3>
  <node4>…</node4>
</root>
What is the most efficient way (I care about execution time) to do this in C#? Assume that I have the XML body as a Stream.
Upvotes: 2
Views: 1057
Reputation: 113322
You're unlikely to get more efficient than:
// Requires: using System.IO; using System.Xml;
public static int GetImmediateChildrenCount(Stream stm)
{
    using(stm)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.CheckCharacters = false; // optimisation - best avoided.
        settings.DtdProcessing = DtdProcessing.Ignore;
        int count = 0;
        using(XmlReader rdr = XmlReader.Create(stm, settings))
            while(rdr.Read())
                // The root element is at depth 0, so its direct children are at depth 1.
                if(rdr.NodeType == XmlNodeType.Element && rdr.Depth == 1)
                    ++count;
        return count;
    }
}
That is, short of actually writing a specialised parser to do just that.
The above scans through an XmlReader, ignoring everything except element start tags (which includes empty elements), and increments its tally whenever the depth is 1; that is, directly below the root node.
It's certainly going to be faster than anything that constructs an XDocument or XmlDocument because it doesn't spend time and memory doing so, though if you were going to use the XDocument or XmlDocument for something else, then those approaches would be faster (the counting bit is fast for them, and the time spent constructing the object is already spent).
If you were going to read several such documents and they had a lot of XML names (element and attribute names, namespace names and namespace prefixes) in common, then you would do well to keep a cache of NameTable objects that you passed into the settings.NameTable property. NameTables aren't thread-safe, so you can't just share a single one across threads, but they are most expensive when "learning" new names, so reusing them gives a performance boost on subsequent documents. This is true only if many names are the same in each document; if the documents are very different, they don't benefit from the "prior knowledge" and you're just wasting cycles moving them around instead of garbage collecting the default one you are given with each new XmlReader. (In fact, you're making their lookups very slightly slower.)
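As a rough illustration of the NameTable reuse described above, here is a minimal sketch. The class name `ChildCounter` and the method name `CountImmediateChildren` are made up for this example; it assumes the caller keeps one NameTable per thread and passes it in for every document.

```csharp
using System.IO;
using System.Xml;

public static class ChildCounter
{
    // Hypothetical helper: counts elements directly under the root,
    // reusing a caller-supplied name table across documents.
    // NameTable is not thread-safe, so each thread needs its own instance.
    public static int CountImmediateChildren(Stream stm, XmlNameTable names)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Ignore,
            NameTable = names // names learned from earlier documents are reused
        };
        int count = 0;
        using (XmlReader rdr = XmlReader.Create(stm, settings))
        {
            while (rdr.Read())
            {
                // Root element is at depth 0; its direct children are at depth 1.
                if (rdr.NodeType == XmlNodeType.Element && rdr.Depth == 1)
                    ++count;
            }
        }
        return count;
    }
}
```

A caller would create `var names = new NameTable();` once and pass the same instance for each stream it processes on that thread.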
If you really want the absolutely most efficient possible, then you can beat the above by reading through the stream and keeping track of <...>, </...> and <.../>, but you also have to handle a bunch of special cases, so your gain over the above is unlikely to be enough to make the effort worth it.
Rough figures for 10000 iterations with your example:
XmlDocument: 2387373
XDocument: 1942206
XmlReader: 1872387
XmlReader with reused NameTable: 1864708
Rough figures for 100 iterations with a 136KiB file based on your example:
XmlDocument: 1887930
XDocument: 1297059
XmlReader: 996636
XmlReader with reused NameTable: 961763
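The answer doesn't show how these figures were measured. One plausible way to produce comparable numbers is sketched below; the class `CountTiming` and its methods are hypothetical names, and the harness assumes the XML is held in a byte array so each iteration gets a fresh MemoryStream.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class CountTiming
{
    // Hypothetical harness: runs `count` over a fresh in-memory stream
    // `iterations` times and returns the elapsed Stopwatch ticks.
    public static long Time(byte[] xml, int iterations, Func<Stream, int> count)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            using (var ms = new MemoryStream(xml))
                count(ms);
        }
        sw.Stop();
        return sw.ElapsedTicks;
    }

    // Builds the whole tree, then counts the root's child elements.
    public static int WithXDocument(Stream s)
    {
        return XDocument.Load(s).Root.Elements().Count();
    }

    // Streams through the document, counting elements at depth 1.
    public static int WithXmlReader(Stream s)
    {
        var settings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Ignore };
        int n = 0;
        using (XmlReader rdr = XmlReader.Create(s, settings))
        {
            while (rdr.Read())
                if (rdr.NodeType == XmlNodeType.Element && rdr.Depth == 1)
                    ++n;
        }
        return n;
    }
}
```

For instance, `CountTiming.Time(bytes, 10000, CountTiming.WithXmlReader)` would time the reader-based count. Absolute tick values depend on the machine and runtime, so only the relative ordering is meaningful.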
Upvotes: 3
Reputation: 25370
You can knock it out in one line using Linq To XML:
var count = XDocument.Load(stream).Root.Elements().Count();
//count = 4
As for efficiency, between the two answers given, my results are:
var sw = Stopwatch.StartNew();
XmlDocument xml = new XmlDocument();
xml.Load(stream);
int i = xml.LastChild.ChildNodes.Count;
sw.Stop();
//971 ticks
and
var sw = Stopwatch.StartNew();
var count = XDocument.Load(stream).Root.Elements().Count();
sw.Stop();
//860 ticks
A pretty negligible difference really, unless you're doing many, many iterations.
Upvotes: 4
Reputation: 3497
Easy as this:
XmlDocument xml = new XmlDocument();
xml.Load(/*path to your file*/);
int i = xml.LastChild.ChildNodes.Count; //as the xml header is first child
Console.WriteLine(i.ToString());
Or as @Jonesy says:
int i = XDocument.Load(/*your stream*/).Root.Elements().Count();
Console.WriteLine(i.ToString());
Both will output 4.
Upvotes: 2