musefan

Reputation: 48435

Using XPath to parse an XML document

Let's say I have the following XML (a quick example):

<rows>
   <row>
      <name>one</name>
   </row>
   <row>
      <name>two</name>
   </row>
</rows>

I am trying to parse this by using XmlDocument and XPath (ultimately so I can make a list of rows).

For example...

XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);

foreach(XmlNode row in doc.SelectNodes("//row"))
{
   string rowName = row.SelectSingleNode("//name").InnerText;
}

Why, within my foreach loop, is rowName always "one"? I am expecting it to be "one" on the first iteration and "two" on the second.

It seems that //name gets the first instance in the document, rather than the first instance in the row as I would expect. After all, I am calling the method on the "row" node. If this is "just how it works" then can anybody please explain how I could change it to work to my needs?

Thank you

Upvotes: 16

Views: 49422

Answers (8)

K P Verma

Reputation: 1

Let's take the XML below as an example and fetch data from the document using XPath:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">
<!-- Created with AIM. -->
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1668.75 1074.75" xmlns:xlink="http://www.w3.org/1999/xlink" preserveAspectRatio="xMidYMid meet" zoomAndPan="magnify" version="1.0" contentScriptType="text/ecmascript" contentStyleType="text/css">
   can't read ": no such element in array
   <g id="#1" class="track" />
   <g id="#5" class="dedication">
      <metadata>
         <meta name="color">Red</meta>
      </metadata>
      <text fill="#181818">AQWSD</text>
   </g>
   <g id="#6" class="wordasword">
      <metadata>
         <meta name="epigraph">Output 1</meta>
         <meta name="color">Red</meta>
         <meta name="refentry">qandadiv</meta>
      </metadata>
      <paramdef fill="none" />
      <text fill="#181818">0.35</text>
   </g>
   <g id="#7" class="wordasword">
      <metadata>
         <meta name="epigraph">Output 2</meta>
         <meta name="color">Red</meta>
         <meta name="refentry">calloutlist</meta>
         <meta name="screen">common></meta>
      </metadata>
      <path fill="none" />
      <text fill="#181818">lineannotation</text>
      <text fill="#181818">WHO</text>
      <paramdef fill="#232323" />
   </g>
   <g id="#" class="wordasword">
      <metadata>
         <meta name="epigraph">Output 3</meta>
         <meta name="color">Red</meta>
         <meta name="refentry">calloutlist</meta>
         <meta name="screen">common></meta>
      </metadata>
      <path fill="none" />
      <text fill="#181818">lineannotation</text>
      <text fill="#181818">WHO</text>
      <paramdef fill="#232323" />
   </g>
</svg>

I have built and tested the code below, and it works correctly.

Below is the run-time value of the XML document above, held in xmlContent.

var xmlContent = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.0//EN\" \"http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd\">\r\n<!--Created with AIM.-->\r\n<svg xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 1668.75 1074.75\">\r\ncan't read \": no such element in array<g id=\"#1\" class=\"track\"></g><g id=\"#5\" class=\"dedication\">\r\n<metadata>\r\n<meta name=\"color\">Red</meta>\r\n</metadata>\r\n<text fill=\"#181818\">AQWSD</text>\r\n</g>\r\n<g id=\"#6\" class=\"wordasword\">\r\n<metadata>\r\n<meta name=\"epigraph\">Output 1</meta>\r\n<meta name=\"color\">Red</meta>\r\n<meta name=\"refentry\">qandadiv</meta>\r\n</metadata>\r\n<paramdef fill=\"none\" />\r\n<text fill=\"#181818\">0.35</text>\r\n</g>\r\n<g id=\"#7\" class=\"wordasword\">\r\n<metadata>\r\n<meta name=\"epigraph\">Output 2</meta>\r\n<meta name=\"color\">Red</meta>\r\n<meta name=\"refentry\">calloutlist</meta>\r\n<meta name=\"screen\">common></meta>\r\n</metadata>\r\n<path fill=\"none\" />\r\n<text fill=\"#181818\">lineannotation</text>\r\n<text fill=\"#181818\">WHO</text>\r\n<paramdef fill=\"#232323\"/>\r\n</g>\r\n<g id=\"#\" class=\"wordasword\">\r\n<metadata>\r\n<meta name=\"epigraph\">Output 3</meta>\r\n<meta name=\"color\">Red</meta>\r\n<meta name=\"refentry\">calloutlist</meta>\r\n<meta name=\"screen\">common></meta>\r\n</metadata>\r\n<path fill=\"none\"/>\r\n<text fill=\"#181818\">lineannotation</text>\r\n<text fill=\"#181818\">WHO</text>\r\n<paramdef fill=\"#232323\"/>\r\n</g>\r\n</svg>";


// Requires: using System; using System.Linq; using System.Xml;
XmlDocument xml = new XmlDocument();
xml.LoadXml(xmlContent);

// Select all g nodes of class wordasword whose metadata contains a meta element named color with the value Red
var gNodesOnClassOfColorRed = xml.SelectNodes("//*[local-name()='g'][@class='wordasword'][*[local-name()='metadata'][*[local-name()='meta'][@name='color'] = 'Red']]").Cast<XmlNode>();

foreach (XmlNode gNode in gNodesOnClassOfColorRed)
{
    // An XmlNode enumerates its child nodes, so Cast<XmlNode>() yields the meta children of metadata
    var metadata = gNode.SelectSingleNode("*[local-name()='metadata']").Cast<XmlNode>();

    // Fetch the epigraph value from the meta element whose name attribute is "epigraph"
    var epigraph = metadata
                    .Where(z => z.Attributes?.GetNamedItem("name")?.Value.Trim().ToLower() == "epigraph")
                    .Select(p => p.InnerText).FirstOrDefault();

    Console.WriteLine(epigraph);
}

The code above fetches the epigraph value from the metadata of each matching g element. The epigraph values are printed as

Output 1, Output 2, Output 3

The code below fetches the list of text elements for each matching g element, using the same XML as above:

XmlNodeList elementList = xml.SelectNodes("//*[local-name()='g'][@class='wordasword'][*[local-name()='text']]");

foreach (XmlNode xmlNode in elementList)     // each g element
{
    XmlNodeList textList = xmlNode.SelectNodes("*[local-name()='text']");
}
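
As an alternative to the local-name() tests above, the namespace can be registered with an XmlNamespaceManager and referenced through a prefix. This is only a sketch under assumptions: the cut-down SVG string and the svg prefix are made up for illustration and are not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical, cut-down SVG fragment in the same namespace as the document above.
const string svg =
    "<svg xmlns=\"http://www.w3.org/2000/svg\">" +
    "<g id=\"#5\" class=\"dedication\"><text>AQWSD</text></g>" +
    "<g id=\"#6\" class=\"wordasword\"><text>0.35</text></g>" +
    "</svg>";

var doc = new XmlDocument();
doc.LoadXml(svg);

// Map the (arbitrary) prefix "svg" to the document's default namespace.
var ns = new XmlNamespaceManager(doc.NameTable);
ns.AddNamespace("svg", "http://www.w3.org/2000/svg");

// Select the text children of every g element of class wordasword.
var found = new List<string>();
foreach (XmlNode text in doc.SelectNodes("//svg:g[@class='wordasword']/svg:text", ns))
{
    found.Add(text.InnerText);
    Console.WriteLine(text.InnerText);
}
```

The prefix only has to match within the XPath expression; the document itself can keep using an unprefixed default namespace.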

Upvotes: 0

Him_Jalpert

Reputation: 2516

Use the following:

doc.LoadXml(xml);

foreach (XmlNode row in doc.SelectNodes("/rows/row"))
{
    // "name" is relative to the current row; "//name" would search from the document root
    string rowName = row.SelectSingleNode("name").InnerText;
}

Upvotes: 0

Sir Crispalot

Reputation: 4854

XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);

foreach(XmlNode row in doc.SelectNodes("//row"))
{
   var rowName = row.SelectSingleNode("name");
}

Is the code you posted actually correct? I get a compile error on row.SelectNode() as it isn't a member of XmlNode.

Anyway, my example above works, but assumes only a single <name> node within the <row> node so you may need to use SelectNodes() instead of SelectSingleNode() if that is not the case.

As others have shown, use .InnerText to get just the value.
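
To illustrate the SelectNodes() case mentioned above, here is a rough sketch. It assumes a hypothetical variation of the question's XML in which the first row has two name children; that extra element is made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical variation of the question's XML: the first row has two <name> children.
const string xml =
    "<rows><row><name>one</name><name>uno</name></row><row><name>two</name></row></rows>";

var doc = new XmlDocument();
doc.LoadXml(xml);

var namesPerRow = new List<List<string>>();
foreach (XmlNode row in doc.SelectNodes("//row"))
{
    // The relative path "name" matches every <name> child of the current row.
    var names = new List<string>();
    foreach (XmlNode name in row.SelectNodes("name"))
    {
        names.Add(name.InnerText);
    }
    namesPerRow.Add(names);
    Console.WriteLine(string.Join(", ", names));
}
// prints "one, uno" and then "two"
```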

Upvotes: 19

AakashM

Reputation: 63378

Your second XPath starts with //. This is an abbreviation for /descendant-or-self::node()/, which you can see starts with /, meaning it searches from the root of the document, whatever the context in which you use it.

You probably want one of:

var rowName = row.SelectSingleNode("name");

to find the name nodes that are immediate children of the row, or

var rowName = row.SelectSingleNode(".//name");

to find name nodes *anywhere* under the row. Note the . in this second XPath, which causes the expression to start from the context node.
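
A small sketch comparing the three expressions side by side, using the XML from the question (the variable names here are just for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Same sample XML as in the question.
const string xml =
    "<rows><row><name>one</name></row><row><name>two</name></row></rows>";

var doc = new XmlDocument();
doc.LoadXml(xml);

var seen = new List<string>();
foreach (XmlNode row in doc.SelectNodes("//row"))
{
    // Absolute path: evaluated from the document root, so always the first <name>.
    string fromRoot = row.SelectSingleNode("//name").InnerText;
    // Relative path: only <name> elements that are direct children of this row.
    string child = row.SelectSingleNode("name").InnerText;
    // Relative descendant path: any <name> below this row.
    string descendant = row.SelectSingleNode(".//name").InnerText;

    seen.Add($"{fromRoot} {child} {descendant}");
    Console.WriteLine(seen[seen.Count - 1]);
}
// prints "one one one" and then "one two two"
```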

Upvotes: 4

Levi Botelho

Reputation: 25224

Use LINQ to XML. Add using System.Xml.Linq; to your code file, then use the following code to get your list:

XDocument xDoc = XDocument.Load(filepath);
IEnumerable<XElement> xNames;

xNames = xDoc.Descendants("name");

That will give you a list of the name elements. Then if you want to turn that into a List<string> just do this:

List<string> list = new List<string>();
foreach (XElement element in xNames)
{
    list.Add(element.Value);
}
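
For comparison, a more compact sketch that stays scoped to each row. It assumes the XML is available as a string (so it uses XDocument.Parse rather than loading from a file path) and takes the single name child of each row:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Same sample XML as in the question, held in a string for this sketch.
const string xml =
    "<rows><row><name>one</name></row><row><name>two</name></row></rows>";

XDocument xDoc = XDocument.Parse(xml);

// For each <row>, take its direct <name> child and convert it to a string.
List<string> names = xDoc.Descendants("row")
    .Select(row => (string)row.Element("name"))
    .ToList();

Console.WriteLine(string.Join(", ", names)); // prints "one, two"
```

The explicit (string) conversion yields null for a row without a name child instead of throwing, which may or may not be what you want.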

Upvotes: 5

DaveShaw

Reputation: 52808

I would use SelectSingleNode, and then the InnerText property.

var rowName = row.SelectSingleNode("name").InnerText;

Upvotes: 2

madd0

Reputation: 9323

The problem is in your second XPath query:

//name

This has a global scope, so no matter where you call it from, it will select all name elements in the document, and SelectSingleNode returns the first of them.

It should work if you replace your expression with:

.//name

Upvotes: 2

Martin Honnen

Reputation: 167716

Use a relative path, e.g. string rowName = row.SelectSingleNode("name").InnerText;.

Upvotes: 3
