Reputation: 2441
I am trying to extract the following values from an XML file:
- CLD_COD
- TIFNAME
The XML file is structured as follows:
<ENVELOPE_CONTENT>
<DOCUMENTS>
<DOCUMENT>
<IDX>1529</IDX>
<ENTITY_PRIORITY>5</ENTITY_PRIORITY>
<CLD_COD>MAGAZINE</CLD_COD>
<CLD_DESC>Revues, magazine</CLD_DESC>
<CATEGORY>OTHER</CATEGORY>
<TIF_FILENAME>revues, magazine_1529_si.tif</TIF_FILENAME>
<COMMENT />
<REJECT_MESSAGES />
<PAGES>
<PAGE>
<PAGIDX>3375</PAGIDX>
<POSITION>1</POSITION>
<TIFNAME>87771593-2FD4-4803-8736-E2C1A898A96B_002.tif</TIFNAME>
<JPEGNAME>87771593-2fd4-4803-8736-e2c1a898a96b_001.jpg</JPEGNAME>
</PAGE>
<PAGE>
<PAGIDX>3376</PAGIDX>
<POSITION>2</POSITION>
<TIFNAME>87771593-2FD4-4803-8736-E2C1A898A96B_004.tif</TIFNAME>
<JPEGNAME>87771593-2fd4-4803-8736-e2c1a898a96b_003.jpg</JPEGNAME>
</PAGE>
<PAGE>
<PAGIDX>3377</PAGIDX>
<POSITION>3</POSITION>
<TIFNAME>87771593-2FD4-4803-8736-E2C1A898A96B_006.tif</TIFNAME>
<JPEGNAME>87771593-2fd4-4803-8736-e2c1a898a96b_005.jpg</JPEGNAME>
</PAGE>
<PAGE>
<PAGIDX>3378</PAGIDX>
<POSITION>4</POSITION>
<TIFNAME>87771593-2FD4-4803-8736-E2C1A898A96B_008.tif</TIFNAME>
<JPEGNAME>87771593-2fd4-4803-8736-e2c1a898a96b_007.jpg</JPEGNAME>
</PAGE>
</PAGES>
</DOCUMENT>
</DOCUMENTS>
</ENVELOPE_CONTENT>
I am using the following C# code to extract the values:
string xmlText = File.ReadAllText(f);
XmlDocument doc = new XmlDocument();
doc.LoadXml(xmlText);
XmlNodeList parentNode = doc.GetElementsByTagName("DOCUMENT");
List<string> p = new List<string>();
string classe = "";
foreach (XmlNode childrenNode in parentNode)
{
classe = childrenNode.SelectSingleNode("CLD_COD").InnerText;
}//end foreach
I managed to extract the value of CLD_COD, but I can't manage to extract the values in TIFNAME.
How can I iterate through the nodes to extract them?
Thank you.
Upvotes: 1
Views: 2693
Reputation: 871
You could also try this:
string xmlText = File.ReadAllText("C:\\Users\\virens\\Desktop\\Testxml.xml");
XmlDocument doc = new XmlDocument();
doc.LoadXml(xmlText);
XmlNodeList parentNode = doc.GetElementsByTagName("DOCUMENT");
List<string> p = new List<string>();
// TIFNAME is not a direct child of DOCUMENT; it sits under PAGES/PAGE,
// so select it with a relative XPath instead of scanning ChildNodes.
foreach (XmlNode node in parentNode[0].SelectNodes("PAGES/PAGE/TIFNAME"))
{
p.Add(node.InnerText);
Console.WriteLine(node.InnerText);
}
Upvotes: 2
Reputation: 1859
If you really have to use the older XmlDocument, you can try something like this:
XmlDocument doc = new XmlDocument();
doc.LoadXml(xmlText);
XmlNodeList xn = doc.SelectNodes("/ENVELOPE_CONTENT/DOCUMENTS/DOCUMENT/PAGES/PAGE/TIFNAME");
foreach (XmlNode xnode in xn)
{
//extract values here
Console.WriteLine(xnode.InnerText);
}
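If the class code and the file names need to stay together per DOCUMENT, the same XPath idea can be applied relative to each DOCUMENT node. The following is only a minimal sketch: the inline XML is a trimmed-down copy of the file from the question, and the variable names (`document`, `tifNames`) are illustrative assumptions rather than anything from the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

class Program
{
    static void Main()
    {
        // Trimmed-down sample with the same shape as the XML in the question.
        string xmlText =
            "<ENVELOPE_CONTENT><DOCUMENTS><DOCUMENT>" +
            "<CLD_COD>MAGAZINE</CLD_COD>" +
            "<PAGES>" +
            "<PAGE><TIFNAME>87771593-2FD4-4803-8736-E2C1A898A96B_002.tif</TIFNAME></PAGE>" +
            "<PAGE><TIFNAME>87771593-2FD4-4803-8736-E2C1A898A96B_004.tif</TIFNAME></PAGE>" +
            "</PAGES>" +
            "</DOCUMENT></DOCUMENTS></ENVELOPE_CONTENT>";

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xmlText);

        foreach (XmlNode document in doc.GetElementsByTagName("DOCUMENT"))
        {
            // Relative XPath: evaluated from the current DOCUMENT node.
            string classe = document.SelectSingleNode("CLD_COD")?.InnerText ?? "";

            List<string> tifNames = new List<string>();
            foreach (XmlNode tif in document.SelectNodes("PAGES/PAGE/TIFNAME"))
            {
                tifNames.Add(tif.InnerText);
            }

            Console.WriteLine(classe + ": " + string.Join(", ", tifNames));
        }
    }
}
```

Because the inner XPath is relative, each DOCUMENT only collects its own pages, which matters once the file contains more than one DOCUMENT element.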
Upvotes: 1
Reputation: 236218
You can use LINQ to XML:
XDocument xdoc = XDocument.Load(f);
var cldCod = (string)xdoc.Descendants("CLD_COD").FirstOrDefault();
var names = from p in xdoc.Descendants("PAGE")
select (string)p.Element("TIFNAME");
Another option is the XPath extension methods (they require using System.Xml.XPath). You can specify the exact path to the elements to avoid searching the whole document:
var root = xdoc.Root;
var cldCod = (string)root.XPathSelectElement("DOCUMENTS/DOCUMENT/CLD_COD");
var names = from n in root.XPathSelectElements("DOCUMENTS/DOCUMENT/PAGES/PAGE/TIFNAME")
select (string)n;
Upvotes: 2
Reputation: 273244
First of all, this is a lot easier with the newer XML API, LINQ to XML (XLinq):
var root = XElement.Parse(xmlText); // or directly .Load(fileName)
// Select returns IEnumerable<string>, so ToList() is needed to get a List<string>
List<string> tifNames = root.Descendants("TIFNAME").Select(e => e.Value).ToList();
Upvotes: 3