Better way to do this LINQ to XML query?

So say I have this XML file:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<Root>
  <Category Name="Tasties">
    <Category Name="Pasta">
      <Category Name="Chicken">
        <Recipe Name="Chicken and Shrimp Scampi" />
        <Recipe Name="Chicken Fettucini Alfredo" />
      </Category>
      <Category Name="Beef">
        <Recipe Name="Spaghetti and Meatballs" />
        <Recipe Name="Lasagna" />
      </Category>
      <Category Name="Pork">
        <Recipe Name="Lasagna" />
      </Category>
      <Category Name="Seafood">
        <Recipe Name="Chicken and Shrimp Scampi" />
      </Category>
    </Category>
  </Category>
</Root>

I want to return the names of all the recipes in Tasties\Pasta\Chicken. How would I do this?

What I have currently is:

var q = from chk in
            (from c in doc.Descendants("Category")
             where c.Attribute("Name").Value == "Chicken"
             select c)
        select from r in chk.Descendants("Recipe")
               select r.Attribute("Name").Value;

foreach (var recipes in q)
{
    foreach (var recipe in recipes)
    {
        Console.WriteLine("Recipe name = {0}", recipe);
    }
}

This kind of works, although it doesn't check the path: it matches every category named Chicken, wherever it sits in the tree. I could dig through each element in the path recursively, but it seems like there is probably a better solution I'm missing. Also, my current query returns IEnumerable<IEnumerable<String>> when all I want is an IEnumerable<String>.

Basically I can make it work but it looks messy and I'd like to see any LINQ suggestions or techniques to do better querying.

Upvotes: 1

Answers (4)

OdeToCode

Reputation: 4986

A little bit late, but extension methods can really help to clean up messy-looking LINQ to XML queries. For your scenario you could work with code like this:

var query = xml.Root
               .Category("Tasties")
               .Category("Pasta")
               .Category("Chicken")
               .Recipes();

... using some techniques I show in From LINQ To XPath And Back Again
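
A minimal sketch of what such extension methods might look like. The class name RecipeXmlExtensions and all three method bodies are assumptions written to fit the query above; they are not taken from the linked article.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helpers: the method names mirror the query above,
// but the implementations are assumptions.
public static class RecipeXmlExtensions
{
    // Child <Category> elements of one element whose Name attribute matches.
    public static IEnumerable<XElement> Category(this XElement parent, string name)
    {
        // The (string) cast returns null instead of throwing when the attribute is missing.
        return parent.Elements("Category")
                     .Where(c => (string)c.Attribute("Name") == name);
    }

    // The same step applied to a sequence, so the calls can be chained.
    public static IEnumerable<XElement> Category(this IEnumerable<XElement> parents, string name)
    {
        return parents.SelectMany(p => p.Category(name));
    }

    // Names of the <Recipe> elements directly under each category.
    public static IEnumerable<string> Recipes(this IEnumerable<XElement> categories)
    {
        return categories.Elements("Recipe")
                         .Select(r => (string)r.Attribute("Name"));
    }
}
```

With these in scope, the chained query returns an IEnumerable<string> directly and only follows direct children, so a Chicken category elsewhere in the tree is not picked up.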

Upvotes: 1

Marc Gravell

Reputation: 1062780

Personally, I'd use XmlDocument and the familiar SelectNodes:

foreach(XmlElement el in doc.DocumentElement.SelectNodes(
   "Category[@Name='Tasties']/Category[@Name='Pasta']/Category[@Name='Chicken']/Recipe")) {
    Console.WriteLine(el.GetAttribute("Name"));
}

For LINQ-to-XML, I'd guess (untested) something like:

var q = from c1 in doc.Root.Elements("Category")
        where c1.Attribute("Name").Value == "Tasties"
        from c2 in c1.Elements("Category")
        where c2.Attribute("Name").Value == "Pasta"
        from c3 in c2.Elements("Category")
        where c3.Attribute("Name").Value == "Chicken"
        from recipe in c3.Elements("Recipe")
        select recipe.Attribute("Name").Value;
foreach (string name in q) {
    Console.WriteLine(name);
}

Edit: if you want the category selection to be more flexible:

    string[] categories = { "Tasties", "Pasta", "Chicken" };
    XDocument doc = XDocument.Parse(xml);
    IEnumerable<XElement> query = doc.Elements();
    foreach (string category in categories) {
        string tmp = category;
        query = query.Elements("Category")
            .Where(c => c.Attribute("Name").Value == tmp);
    }
    foreach (string name in query.Descendants("Recipe")
        .Select(r => r.Attribute("Name").Value)) {
        Console.WriteLine(name);
    }

This should now work for any number of levels, selecting all recipes at the chosen level or below.

Edit for discussion (comments) on why Where has a local tmp variable:

This might get a bit complex, but I'm trying to do the question justice ;-p

Basically, the foreach (with the iterator lvalue "captured") looks like:

class SomeWrapper {
    public string category;
    public bool AnonMethod(XElement c) {
        return c.Attribute("Name").Value == category;
    }
}
...
SomeWrapper wrapper = new SomeWrapper(); // note only 1 of these
using(var iter = categories.GetEnumerator()) {
    while(iter.MoveNext()) {
        wrapper.category = iter.Current;
        query = query.Elements("Category")
             .Where(wrapper.AnonMethod);
    }
}

It might not be obvious, but since Where isn't evaluated immediately, the value of category (via the predicate AnonMethod) isn't checked until much later. This is an unfortunate consequence of the precise details of the C# spec. Introducing tmp (scoped inside the foreach) means that the capture happens per iteration:

class SecondWrapper {
    public string tmp;
    public bool AnonMethod(XElement c) {
        return c.Attribute("Name").Value == tmp;
    }
}
...
string category;
using(var iter = categories.GetEnumerator()) {
    while(iter.MoveNext()) {
        category = iter.Current;
        SecondWrapper wrapper = new SecondWrapper(); // note 1 per iteration
        wrapper.tmp = category;
        query = query.Elements("Category")
             .Where(wrapper.AnonMethod);
    }
}

And hence it doesn't matter whether we evaluate now or later. Complex and messy. You can see why I favor a change to the specification!!!
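
The same effect can be reproduced in a few lines. This sketch uses a plain for loop, whose loop variable is shared across iterations in every C# version; C# 5 and later compilers give foreach a fresh variable per iteration, so the tmp copy is only needed for foreach on older compilers. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class; names are illustrative only.
public static class CaptureDemo
{
    // Every lambda captures the single loop variable i and reads it only when invoked.
    public static List<Func<int>> SharedVariable()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            funcs.Add(() => i);
        }
        return funcs;
    }

    // Each iteration declares its own tmp, so each lambda captures a distinct variable.
    public static List<Func<int>> CopyPerIteration()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int tmp = i;
            funcs.Add(() => tmp);
        }
        return funcs;
    }
}
```

Invoking the delegates after the loop has finished, the ones from SharedVariable() all return 3, while the ones from CopyPerIteration() return 0, 1 and 2.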

Upvotes: 3

Joel Mueller

Reputation: 28765

If you add a using directive for System.Xml.XPath, you get an XPathSelectElements() extension method on your XDocument. That lets you select nodes with an XPath expression if you're more comfortable with that.
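
A minimal sketch of the XPath route, assuming the document has the shape shown in the question. The class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath; // brings XPathSelectElements into scope

// Hypothetical wrapper; only XPathSelectElements is a framework API.
public static class XPathRecipes
{
    public static List<string> ChickenPastaRecipeNames(XDocument doc)
    {
        // Absolute path, so only Tasties\Pasta\Chicken is matched.
        const string path =
            "/Root/Category[@Name='Tasties']/Category[@Name='Pasta']" +
            "/Category[@Name='Chicken']/Recipe";

        return doc.XPathSelectElements(path)
                  .Select(r => (string)r.Attribute("Name"))
                  .ToList();
    }
}
```

Because the expression spells out every step, a Chicken category under a different parent is ignored.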

Otherwise, you can flatten out your IEnumerable<IEnumerable<String>> into just an IEnumerable<string> with SelectMany:

IEnumerable<IEnumerable<String>> foo = myLinqResults;
IEnumerable<string> bar = foo.SelectMany(x => x);
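
Applied to the query in the question, SelectMany can replace the nested select so no flattening step is needed afterwards. This is a sketch with hypothetical class and method names; like the original query, it still matches any category named Chicken and does not check the path.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical demo class; names are illustrative only.
public static class FlattenDemo
{
    public static IEnumerable<string> RecipeNames(XDocument doc)
    {
        return doc.Descendants("Category")
                  .Where(c => (string)c.Attribute("Name") == "Chicken")
                  // SelectMany merges the per-category recipe lists into one sequence.
                  .SelectMany(c => c.Descendants("Recipe"))
                  .Select(r => (string)r.Attribute("Name"));
    }
}
```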

Upvotes: 1

CoderDennis

Reputation: 13837

Here's code that is similar to Marc's 2nd example, but it's tested and verified.

var q = from t in doc.Root.Elements("Category")
        where t.Attribute("Name").Value == "Tasties"
        from p in t.Elements("Category")
        where p.Attribute("Name").Value == "Pasta"
        from c in p.Elements("Category")
        where c.Attribute("Name").Value == "Chicken"
        from r in c.Elements("Recipe")
        select r.Attribute("Name").Value;

foreach (string recipe in q)
{
    Console.WriteLine("Recipe name = {0}", recipe);
}

In general, I'd say you only want a single select clause in your LINQ queries. You were getting the IEnumerable<IEnumerable<String>> because of your nested select clauses.

Upvotes: 1
