Stuart

Reputation: 1151

xmlNode.SelectSingleNode always returns same value even though the node changes

I am reading in a bunch of XML files, transforming them, and loading the data into another system.

Previously I did this using ThreadPool. However, the provider of the files has changed, and with it the file structure, so I'm now trying async/await and getting an odd result.

As I process the files, I get a list of XmlNodes and loop over them:

foreach (XmlNode currentVenue in venueNodes)
{
      Console.WriteLine(currentVenue.OuterXml);
      Console.WriteLine(currentVenue.SelectSingleNode(@"//venueName").InnerText);
}

However, the second WriteLine always returns the result expected for the first node. For example:

<venue venueID="xartrix" lastModified="2012-08-20 10:49:30"><venueName>Artrix</venueName></venue>
Artrix
<venue venueID="xbarins" lastModified="2013-04-29 11:39:07"><venueName>The Barber Institute Of Fine Arts, University Of Birmingham</venueName></venue>
Artrix
<venue venueID="xbirmus" lastModified="2012-11-13 16:41:13"><venueName>Birmingham Museum &amp; Art Gallery</venueName></venue>
Artrix

Here is the complete code:

public async Task ProcessFiles()
{
    string[] filesToProcess = Directory.GetFiles(_filePath);
    List<Task> tasks = new List<Task>();

    foreach (string currentFile in filesToProcess)
    {
        tasks.Add(Task.Run(()=>processFile(currentFile)));
    }

    await Task.WhenAll(tasks);

}

private async Task processFile(string currentFile)
{
    try
    {
         XmlDocument currentXmlFile = new XmlDocument();
         currentXmlFile.Load(currentFile);

         //select nodes for processing
         XmlNodeList venueNodes = currentXmlFile.SelectNodes(@"//venue");

         foreach (XmlNode currentVenue in venueNodes)
         {
              Console.WriteLine(currentVenue.InnerXml);
              Console.WriteLine(currentVenue.SelectSingleNode(@"//venueName").InnerText);                 
         }
     }
     catch (Exception e)
     {
         Console.WriteLine(e.Message);
     }
 }

Obviously I've missed something, but I cannot see what. Can someone point it out, please?

Upvotes: 6

Views: 4933

Answers (2)

fourpastmidnight

Reputation: 4234

SelectSingleNode returns only the first node, in document order, that matches the XPath expression. @jbl is correct: //venueName starts from the document root, no matter which node you call SelectSingleNode on. The // XPath operator is the "descendant selector" operator (shorthand for /descendant-or-self::node()/), and an expression that begins with it is absolute.

I work with XML and XPath often, and this is a common mistake. You need to make sure that your context node is correct when calling SelectSingleNode. As noted above, //venueName gets the first <venueName /> node in document order, starting from the root of the document, which is why every iteration prints the same name.

To get the <venueName /> node that sits under the node you're currently iterating over, you need to use the following code:

foreach (XmlNode currentVenue in venueNodes)
{
    Console.WriteLine(currentVenue.OuterXml);
    // The leading '.' in the XPath expression passed to
    // SelectSingleNode below means "start from the current node".
    // Without it, the search starts from the document root,
    // not from currentVenue.
    Console.WriteLine(
        currentVenue.SelectSingleNode(@".//venueName").InnerText
    ); 
}

That should solve your problem.
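
If you want to see the difference in isolation, here is a minimal sketch. The <venues> root element, the class name VenueXPathDemo and the helper NamesPerVenue are made up for illustration; only the <venue> and <venueName> elements come from the question, so treat this as a demonstration under those assumptions rather than your actual file layout.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class VenueXPathDemo
{
    // Hypothetical sample data modelled on the output in the question.
    // The <venues> wrapper is an assumption; the real root element is unknown.
    public const string SampleXml =
        "<venues>" +
        "<venue venueID=\"xartrix\"><venueName>Artrix</venueName></venue>" +
        "<venue venueID=\"xbirmus\"><venueName>Birmingham Museum &amp; Art Gallery</venueName></venue>" +
        "</venues>";

    // Runs the given XPath against every <venue> node and collects the InnerText
    // of the first match (or null when nothing matches).
    public static List<string> NamesPerVenue(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var names = new List<string>();
        foreach (XmlNode venue in doc.SelectNodes("//venue"))
        {
            XmlNode name = venue.SelectSingleNode(xpath);
            names.Add(name == null ? null : name.InnerText);
        }
        return names;
    }

    public static void Main()
    {
        // Absolute path: always resolves from the document root.
        // prints "Artrix | Artrix"
        Console.WriteLine(string.Join(" | ", NamesPerVenue(SampleXml, "//venueName")));

        // Relative path: resolves from the current <venue> node.
        // prints "Artrix | Birmingham Museum & Art Gallery"
        Console.WriteLine(string.Join(" | ", NamesPerVenue(SampleXml, ".//venueName")));
    }
}
```

The only thing that changes between the two calls is the leading '.', which is what ties the search to the context node.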

Upvotes: 14

jbl

Reputation: 15403

Doesn't //venueName search from the document root?

I guess that, combined with SelectSingleNode, it will always end up on the same node (the first venueName node in the document).

You could try replacing //venueName with venueName.
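
One caveat, sketched below with made-up data: a bare venueName only matches direct children of the context node, whereas .//venueName matches descendants at any depth. The class ChildVersusDescendantDemo, the helper FirstMatch and the <details> wrapper element are all hypothetical; in the question's sample output venueName is a direct child, so both forms should behave the same there.

```csharp
using System;
using System.Xml;

public static class ChildVersusDescendantDemo
{
    // Evaluates xpath with the document element (<venue>) as the context node
    // and returns the InnerText of the first match, or null if nothing matches.
    public static string FirstMatch(string venueXml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(venueXml);
        XmlNode match = doc.DocumentElement.SelectSingleNode(xpath);
        return match == null ? null : match.InnerText;
    }

    public static void Main()
    {
        // Shape taken from the question: venueName is a direct child of venue.
        string flat = "<venue><venueName>Artrix</venueName></venue>";
        // Hypothetical shape: venueName wrapped in an invented <details> element.
        string nested = "<venue><details><venueName>Artrix</venueName></details></venue>";

        Console.WriteLine(FirstMatch(flat, "venueName") ?? "(no match)");      // prints "Artrix"
        Console.WriteLine(FirstMatch(flat, ".//venueName") ?? "(no match)");   // prints "Artrix"
        Console.WriteLine(FirstMatch(nested, "venueName") ?? "(no match)");    // prints "(no match)"
        Console.WriteLine(FirstMatch(nested, ".//venueName") ?? "(no match)"); // prints "Artrix"
    }
}
```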

Upvotes: 2
