Ren85

Reputation: 65

LINQ / Xpath query for ungrouped and repeated XML elements

I am new to .NET and I am having some trouble implementing queries in LINQ to XML.

I have an XML file in a strange format:

<calendar>
    <event>
        <amount>1200</amount>
        <age>40</age>
        <country>FR</country>

        <amount>255</amount>
        <age>16</age>
        <country>UK</country>

        <amount>10524</amount>
        <age>18</age>
        <country>FR</country>

        <amount>45</amount>
        <age>12</age>
        <country>CH</country>
    </event>
    <event>
        <amount>1540</amount>
        <age>25</age>
        <country>UK</country>

        <amount>255</amount>
        <age>31</age>
        <country>CH</country>

        <amount>4310</amount>
        <age>33</age>
        <country>FR</country>

        <amount>45</amount>
        <age>17</age>
        <country>FR</country>
    </event>
</calendar>

From this file I want to compute the sum of every <amount> element value, where <age> is greater than '20' and <country> is either 'FR' or 'CH'.

This operation is independent of the tag <event> (all <amount> elements that check the above conditions should be summed, whether they're under the same or different <event> elements).

My problem is that I have no element tag that groups <amount>, <age> and <country> together... (I can't change the XML format, I'm consuming it from a Web Service I can't access).

If I had a hypothetical <transfer> tag grouping these triples together, I think the code would simply be:

XElement root = XElement.Load("calendar.xml");
// Element names are case-sensitive, so they must match the XML exactly.
decimal sum =
    (from trf in root.Elements("event").Elements("transfer")
     where (int) trf.Element("age") > 20 &&
           ((string) trf.Element("country") == "FR" ||
            (string) trf.Element("country") == "CH")
     select (decimal) trf.Element("amount")).Sum();

Should I programmatically group these elements? Thanks in advance!

Upvotes: 2

Views: 839

Answers (4)

Anders

Reputation: 15397

Well... I'm not sure how you would accomplish that in LINQ, but here's an XPath query that works for me on the data you provided:

Edit:

  1. returns nodes:

    //*[text()='FR' or text()='CH']/preceding::age[number(text())>20][1]/preceding::amount[1]
    
  2. returns sum:

    sum(//*[text()='FR' or text()='CH']/preceding::age[number(text())>20][1]/preceding::amount[1]/text())
    

Upvotes: 1

Paolo Falabella

Reputation: 25844

If I were you, I would just pre-process the XML (perhaps reading it node by node with an XmlReader) and load it into a more hierarchical structure. That would make it easier to search for elements, and also to sort or filter them, without losing their relationship (which is currently based solely on their order).

EDIT (see the discussion in the comments): As far as I know, the XML specification does not say that the order of elements is significant, so the parsers you use (or any pre-processing of the XML as a whole, or extraction of its elements) could change the order of the amount, age and country elements at the same level.

While I think most operations tend to preserve the document order, the possibility of subtle and hard-to-find bugs due to random reorderings would not let me sleep too well...
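
The pre-processing idea could be sketched in C# roughly as follows. This is only a minimal sketch, assuming every <event> holds its children strictly as amount, age, country triples. The names CalendarReader, ReadTransfers and SumMatching are made up for illustration, and it uses LINQ to XML instead of an XmlReader to keep it short. It fails loudly if the order is not what it expects, rather than silently summing the wrong values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: none of these names come from the web service.
public static class CalendarReader
{
    // Rebuilds the implicit (amount, age, country) triples of every <event>.
    // Assumption: the children always appear in exactly that order.
    public static IEnumerable<(decimal Amount, int Age, string Country)> ReadTransfers(XElement calendar)
    {
        foreach (XElement ev in calendar.Elements("event"))
        {
            List<XElement> children = ev.Elements().ToList();
            if (children.Count % 3 != 0)
                throw new FormatException("An <event> does not contain whole triples.");

            for (int i = 0; i < children.Count; i += 3)
            {
                // Guard against reordered or missing elements.
                if (children[i].Name != "amount" ||
                    children[i + 1].Name != "age" ||
                    children[i + 2].Name != "country")
                {
                    throw new FormatException("Unexpected element order in <event>.");
                }

                yield return ((decimal)children[i], (int)children[i + 1], (string)children[i + 2]);
            }
        }
    }

    // Sums the amounts whose age is above 20 and whose country is FR or CH.
    public static decimal SumMatching(XElement calendar)
    {
        return ReadTransfers(calendar)
            .Where(t => t.Age > 20 && (t.Country == "FR" || t.Country == "CH"))
            .Sum(t => t.Amount);
    }
}
```

Calling CalendarReader.SumMatching(XElement.Load("calendar.xml")) on a well-formed version of the sample document from the question should give 5765.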

Upvotes: 2

Dimitre Novatchev

Reputation: 243479

Use:

sum(/*/*/amount
      [following-sibling::age[1] > 20
     and
       contains('FRCH',
                following-sibling::country[1])
      ])

XSLT-based verification:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:value-of select=
   "sum(/*/*/amount
          [following-sibling::age[1] > 20
         and
           contains('FRCH',
                    following-sibling::country[1])
          ])"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<calendar>
    <event>
        <amount>1200</amount>
        <age>40</age>
        <country>FR</country>
        <amount>255</amount>
        <age>16</age>
        <country>UK</country>
        <amount>10524</amount>
        <age>18</age>
        <country>FR</country>
        <amount>45</amount>
        <age>12</age>
        <country>CH</country>
    </event>
    <event>
        <amount>1540</amount>
        <age>25</age>
        <country>UK</country>
        <amount>255</amount>
        <age>31</age>
        <country>CH</country>
        <amount>4310</amount>
        <age>33</age>
        <country>FR</country>
        <amount>45</amount>
        <age>17</age>
        <country>FR</country>
    </event>
</calendar>

the XPath expression is evaluated and the correct result is output:

5765
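
Since the question is about .NET, the same expression can probably be evaluated directly from C# as well, without going through XSLT. The following is a minimal sketch, assuming the System.Xml.XPath extension methods for LINQ to XML are available. XPathSum and Evaluate are hypothetical names. The document is loaded as an XDocument so that the absolute path /*/*/amount starts from a real document root.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

// Hypothetical wrapper around the XPath expression shown above.
public static class XPathSum
{
    // The expression is unchanged, only split across two string literals.
    private const string Expression =
        "sum(/*/*/amount[following-sibling::age[1] > 20 and " +
        "contains('FRCH', following-sibling::country[1])])";

    public static double Evaluate(XDocument doc)
    {
        // XPathEvaluate returns object; a numeric XPath result comes back as a double.
        return (double)doc.XPathEvaluate(Expression);
    }
}
```

For example, XPathSum.Evaluate(XDocument.Load("calendar.xml")) would be the call for a file on disk (the path is an assumption). Note that contains('FRCH', ...) also accepts any other substring of 'FRCH', such as 'RC' or an empty country, so it is only safe when the country codes are known to be well-formed.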

Do note: The currently accepted answer contains incorrect XPath expressions, and the sum they produce is wrong. This is illustrated by the XSLT transformation below (the first number is the correct result; the second number is produced by the XPath expression from the accepted answer):

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:value-of select=
   "sum(/*/*/amount
          [following-sibling::age[1] > 20
         and
           contains('FRCH',
                    following-sibling::country[1])
          ])"/>

============
  <xsl:value-of select="sum(//*[text()='FR' or text()='CH']/preceding::age[number(text())>20][1]/preceding::amount[1]/text())"/>

 </xsl:template>
</xsl:stylesheet>

Result:

5765

============
  12475

Upvotes: 1

Abdul Munim

Reputation: 19217

Try this:

var xe = XElement.Load(@"calendar.xml");
var countries = new List<string> { "FR", "CH" };

// For each <amount>, check the first <age> and <country> siblings that follow it.
var sum = xe.Descendants("amount")
    .Where(e =>
           (int)e.ElementsAfterSelf("age").First() > 20 &&
           countries.Contains((string)e.ElementsAfterSelf("country").First()))
    .Sum(e => (double)e);

I have tested the code. Note that it relies on the amount element being the first element of each group, because it looks for age and country among the siblings that follow it.

Upvotes: 2
