Reputation: 847
I have the following XML file:
<lexicon>
  <word>
    <base>a</base>
    <category>determiner</category>
    <id>E0006419</id>
  </word>
  <word>
    <base>abandon</base>
    <category>verb</category>
    <id>E0006429</id>
    <ditransitive/>
    <transitive/>
  </word>
  <word>
    <base>abbey</base>
    <category>noun</category>
    <id>E0203496</id>
  </word>
  <word>
    <base>ability</base>
    <category>noun</category>
    <id>E0006490</id>
  </word>
  <word>
    <base>able</base>
    <category>adjective</category>
    <id>E0006510</id>
    <predicative/>
    <qualitative/>
  </word>
  <word>
    <base>abnormal</base>
    <category>adjective</category>
    <id>E0006517</id>
    <predicative/>
    <qualitative/>
  </word>
  <word>
    <base>abolish</base>
    <category>verb</category>
    <id>E0006524</id>
    <transitive/>
  </word>
</lexicon>
I need to read this file in a C# application and, only if the category is verb, print the entire word element.
How can I do that?
Upvotes: 31
Views: 134479
Reputation: 14386
You could use XPath, too. It is a bit old-fashioned but still effective:
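using System;
using System.Xml;
...
// xml is a string holding the document; use xmlDocument.Load(path) to read from a file instead.
XmlDocument xmlDocument = new XmlDocument();
xmlDocument.LoadXml(xml);
// Select every <word> child of the root whose <category> is "verb".
foreach (XmlElement xmlElement in
    xmlDocument.DocumentElement.SelectNodes("word[category='verb']"))
{
    Console.Out.WriteLine(xmlElement.OuterXml);
}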
Upvotes: 8
Reputation: 1582
This is how I would do it (the code below has been tested; the full source is at the end). Begin by creating a class with the common properties:
class Word
{
    public string Base { get; set; }
    public string Category { get; set; }
    public string Id { get; set; }
}
Load the XML with XDocument (INPUT_DATA is used here for demonstration purposes) and find the lexicon element:
XDocument doc = XDocument.Parse(INPUT_DATA);
XElement lex = doc.Element("lexicon");
Make sure the element exists, then use LINQ to extract the matching word elements from it:
Word[] catWords = null;
if (lex != null)
{
    IEnumerable<XElement> words = lex.Elements("word");
    catWords = (from itm in words
                where itm.Element("category") != null
                    && itm.Element("category").Value == "verb"
                    && itm.Element("id") != null
                    && itm.Element("base") != null
                select new Word()
                {
                    Base = itm.Element("base").Value,
                    Category = itm.Element("category").Value,
                    Id = itm.Element("id").Value,
                }).ToArray<Word>();
}
The where clause checks that the category element exists and that its value is verb, then checks that the id and base elements exist as well.
The LINQ query returns an IEnumerable<Word>, so calling ToArray<Word>() materializes the results into an array (it does not cast anything; it runs the query and copies the results).
Then print it to get:
[Found]
Id: E0006429
Base: abandon
Category: verb
[Found]
Id: E0006524
Base: abolish
Category: verb
Full Source:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
namespace test
{
    class Program
    {
        class Word
        {
            public string Base { get; set; }
            public string Category { get; set; }
            public string Id { get; set; }
        }
        static void Main(string[] args)
        {
            XDocument doc = XDocument.Parse(INPUT_DATA);
            XElement lex = doc.Element("lexicon");
            Word[] catWords = null;
            if (lex != null)
            {
                IEnumerable<XElement> words = lex.Elements("word");
                catWords = (from itm in words
                            where itm.Element("category") != null
                                && itm.Element("category").Value == "verb"
                                && itm.Element("id") != null
                                && itm.Element("base") != null
                            select new Word()
                            {
                                Base = itm.Element("base").Value,
                                Category = itm.Element("category").Value,
                                Id = itm.Element("id").Value,
                            }).ToArray<Word>();
            }
            //print it
            if (catWords != null)
            {
                Console.WriteLine("Words with <category> and value verb:\n");
                foreach (Word itm in catWords)
                    Console.WriteLine("[Found]\n Id: {0}\n Base: {1}\n Category: {2}\n",
                        itm.Id, itm.Base, itm.Category);
            }
        }
        const string INPUT_DATA =
@"<?xml version=""1.0""?>
<lexicon>
<word>
<base>a</base>
<category>determiner</category>
<id>E0006419</id>
</word>
<word>
<base>abandon</base>
<category>verb</category>
<id>E0006429</id>
<ditransitive/>
<transitive/>
</word>
<word>
<base>abbey</base>
<category>noun</category>
<id>E0203496</id>
</word>
<word>
<base>ability</base>
<category>noun</category>
<id>E0006490</id>
</word>
<word>
<base>able</base>
<category>adjective</category>
<id>E0006510</id>
<predicative/>
<qualitative/>
</word>
<word>
<base>abnormal</base>
<category>adjective</category>
<id>E0006517</id>
<predicative/>
<qualitative/>
</word>
<word>
<base>abolish</base>
<category>verb</category>
<id>E0006524</id>
<transitive/>
</word>
</lexicon>";
    }
}
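The null checks in the where clause can also be folded into the comparison itself. The following is only a sketch, not part of the tested source above: it relies on the explicit XElement-to-string conversion, which yields null for a missing element instead of throwing. The names LexiconReader and FindByCategory are hypothetical and chosen for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public class Word
{
    public string Base { get; set; }
    public string Category { get; set; }
    public string Id { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class LexiconReader
{
    // Returns every <word> under <lexicon> whose <category> equals the given value
    // and which also has <id> and <base> children.
    public static Word[] FindByCategory(string xml, string category)
    {
        XDocument doc = XDocument.Parse(xml);
        XElement lex = doc.Element("lexicon");
        if (lex == null)
            return new Word[0];

        return lex.Elements("word")
            // (string)element is null when the element is missing,
            // so no separate "!= null" check on the element is needed.
            .Where(w => (string)w.Element("category") == category
                     && (string)w.Element("id") != null
                     && (string)w.Element("base") != null)
            .Select(w => new Word
            {
                Base = (string)w.Element("base"),
                Category = (string)w.Element("category"),
                Id = (string)w.Element("id"),
            })
            .ToArray();
    }
}
```

Calling LexiconReader.FindByCategory(INPUT_DATA, "verb") should give the same two entries as the query above, assuming the same input.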
Upvotes: 5
Reputation: 236308
XDocument xdoc = XDocument.Load(path_to_xml);
// The word elements are children of the root (lexicon), not of the document itself.
var word = xdoc.Root.Elements("word")
    .SingleOrDefault(w => (string)w.Element("category") == "verb");
This query returns the whole word XElement. If there is more than one word element with category verb, you will get an InvalidOperationException. If there are no elements with category verb, the result will be null.
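Since the file in the question contains more than one verb, it may help to see the single-match and all-matches variants side by side. This is a minimal sketch under the assumption that the word elements sit directly under the root; VerbLookup, FindSingle and FindAll are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration only.
public static class VerbLookup
{
    // Returns the one matching <word>, or null if none matches.
    // Throws InvalidOperationException if more than one <word> matches.
    public static XElement FindSingle(XDocument doc, string category)
    {
        return doc.Root.Elements("word")
            .SingleOrDefault(w => (string)w.Element("category") == category);
    }

    // Returns every matching <word>; the list is empty if none matches.
    public static List<XElement> FindAll(XDocument doc, string category)
    {
        return doc.Root.Elements("word")
            .Where(w => (string)w.Element("category") == category)
            .ToList();
    }
}
```

Each XElement returned by FindAll can be passed to Console.WriteLine, which prints the element's full markup.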
Upvotes: 2
Reputation: 31464
Alternatively, you can run an XPath query via the XPathSelectElements extension method (it requires using System.Xml.XPath):
var document = XDocument.Parse(yourXmlAsString);
var words = document.XPathSelectElements("//word[./category[text() = 'verb']]");
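For completeness, here is a self-contained sketch of the same idea with the required using directives in place. It assumes the root element is lexicon, as in the question, and uses an absolute path instead of //; XPathLookup and SelectVerbs are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath; // brings the XPathSelectElements extension method into scope

// Hypothetical helper for illustration only.
public static class XPathLookup
{
    // Returns every <word> directly under <lexicon> whose <category> is "verb".
    public static List<XElement> SelectVerbs(string xml)
    {
        XDocument document = XDocument.Parse(xml);
        return document
            .XPathSelectElements("/lexicon/word[category='verb']")
            .ToList();
    }
}
```

The returned elements can be printed one at a time with Console.WriteLine to get each word element in full.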
Upvotes: 2
Reputation: 17680
You could use LINQ to XML.
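var xmlStr = File.ReadAllText("fileName.xml");
var str = XElement.Parse(xmlStr);
var result = str.Elements("word")
    .Where(x => x.Element("category").Value.Equals("verb")).ToList();
// Printing the list itself only shows its type name, so print each element.
foreach (var word in result)
    Console.WriteLine(word);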
Upvotes: 31