DoIt

Reputation: 3428

Nested querying of an XML document using LINQ in C#

I am trying to read data from an XML document into an array in C# using LINQ. Some of the values require nested queries over the elements of the XML, which looks like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>    
    <Catalog>
     <Book ISBN="1.1.1.1" Genre="Thriller">
      <Title  Side="1">
       <Pty R="1" ID="Seller_ID">
         <Sub Typ="26" ID="John">
         </Sub>
       </Pty>
       <Pty R="2" ID="ABC">
       </Pty>
        </Title>
    </Book>
    <Book ISBN="1.2.1.1" Genre="Thriller">
      <Title  Side="2">
       <Pty R="1" ID="Buyer_ID">
         <Sub Typ="26" ID="Brook">
         </Sub>
       </Pty>
       <Pty R="2" ID="XYZ">
       </Pty>
        </Title>
    </Book>
    </Catalog>

In the above XML document, Side="1" represents the sell side and Side="2" represents the buy side. I want to store the above elements and attributes in an array that has the following fields:

Array: ISBN, Genre, PublishDate, Buyer_Company, Seller_Company, Buyer_Broker, Seller_Broker

I was able to retrieve the plain elements and attributes, but I am not sure how to handle fields that depend on other elements. Buyer_Company, Seller_Company, Buyer_Broker and Seller_Broker are determined by the Side, Pty and Sub elements. For example, Buyer_Company is the ID attribute of the Pty element with R=2 under a Title with Side=2. Similarly, Buyer_Broker is the ID attribute of a Sub element with Typ=26 (other XML data may use a different Typ value), where that Sub is a child of the Pty element with R=1, which is in turn a child of the Title element with Side=2.

Code I used to retrieve independent elements is

var result = doc.Descendants("Book")
        .Select(b => new
        {
            ISBN= b.Attribute("ISBN").Value,
            Genre=b.Attribute("Genre").Value,
            PublishDate= b.Element("Title").Attribute("MMY").Value,        

        })
        .ToArray();

And I worked on querying within a single element as follows

  Company= (string)b.Descendants("Pty")
                             .Where(e => (int)e.Attribute("R") == 7)
                             .Select(e => e.Attribute("ID"))
                             .Single()

But this doesn't take into account the Side attribute of the enclosing Title element.

Sample Data

First Book Element

ISBN:1.1.1.1
Genre:Thriller
Seller_Company:NULL
Seller_Broker:NULL
Buyer_Company:ABC
Buyer_Broker:John

Second Book Element

ISBN:1.1.1.1
Genre:Thriller
Seller_Company:XYZ
Seller_Broker:Brook
Buyer_Company: NULL
Buyer_Broker:NULL

Side=1 represents the seller side and Side=2 represents the buyer side, which is why the seller fields are null in the first element of the resulting array and the buyer fields are null in the second.

May I know a better way to solve this?

Upvotes: 0

Views: 1295

Answers (5)

Parthasarathy

Reputation: 2818

Edited to match the question:

Using XPath:

private static string GetCompanyValue(XElement bookElement, string side, string r)
{
  string format = "Title[@Side={0}]/Pty[@R={1}]";
  return GetValueByXPath(bookElement, string.Format(format, side, r));
}

private static string GetBrokerValue(XElement bookElement, string side)
{
  string format = "Title[@Side={0}]/Pty[@R=1]/Sub[@Typ=26]";
  return GetValueByXPath(bookElement, string.Format(format, side));
}

private static string GetValueByXPath(XElement bookElement, string expression)
{
  XElement element = bookElement.XPathSelectElement(expression);
  return element != null ? element.Attribute("ID").Value : null;
}

And the calling code looks as below.

var result = doc.Descendants("Book")                            
                .Select(book => new
                {
                   ISBN = book.Attribute("ISBN").Value,
                   Genre = book.Attribute("Genre").Value,
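                   Buyer_Company = GetCompanyValue(book, "2", "2"),
                   Buyer_Broker = GetBrokerValue(book, "2"),
                   // Seller side uses the same helpers with Side = 1
                   Seller_Company = GetCompanyValue(book, "1", "2"),
                   Seller_Broker = GetBrokerValue(book, "1")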
                })
                .ToArray();

Add a using directive for System.Xml.XPath; XPathSelectElement is an extension method defined in that namespace.
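The question mentions that Typ can take values other than 26, while the helper above hardcodes it. A minimal sketch of one way to pass it in, assuming the same XML layout as in the question; this three-argument GetBrokerValue is a hypothetical variant, not part of the original answer:

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

// One Book element from the question, with the empty elements shortened.
var book = XElement.Parse(@"<Book ISBN=""1.2.1.1"" Genre=""Thriller"">
  <Title Side=""2"">
    <Pty R=""1"" ID=""Buyer_ID""><Sub Typ=""26"" ID=""Brook"" /></Pty>
    <Pty R=""2"" ID=""XYZ"" />
  </Title>
</Book>");

// Hypothetical variant of GetBrokerValue: Side and Typ are both parameters.
// Returns null when no matching Sub element exists.
static string GetBrokerValue(XElement bookElement, int side, int typ)
{
    string expression = $"Title[@Side={side}]/Pty[@R=1]/Sub[@Typ={typ}]";
    XElement element = bookElement.XPathSelectElement(expression);
    return (string)element?.Attribute("ID");
}

string buyerBroker = GetBrokerValue(book, 2, 26);   // matching Sub on the buy side
string sellerBroker = GetBrokerValue(book, 1, 26);  // this book has no Side=1 Title
string otherTyp = GetBrokerValue(book, 2, 27);      // no Sub with Typ=27

Console.WriteLine($"{buyerBroker ?? "NULL"} {sellerBroker ?? "NULL"} {otherTyp ?? "NULL"}");
```

Casting the XAttribute to string instead of reading .Value keeps the helper null-safe when the XPath matches nothing.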

Upvotes: 2

Adam Prescott

Reputation: 945

Now that you've provided some examples, I think this will work for you.

const string xml =
    @"<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>    
    <Catalog>
    <Book ISBN=""1.1.1.1"" Genre=""Thriller"">
    <Title  Side=""1"">
    <Pty R=""1"" ID=""Seller_ID"">
        <Sub Typ=""26"" ID=""John"">
        </Sub>
    </Pty>
    <Pty R=""2"" ID=""ABC"">
    </Pty>
        </Title>
    </Book>
    <Book ISBN=""1.2.1.1"" Genre=""Thriller"">
    <Title  Side=""2"">
    <Pty R=""1"" ID=""Buyer_ID"">
        <Sub Typ=""26"" ID=""Brook"">
        </Sub>
    </Pty>
    <Pty R=""2"" ID=""XYZ"">
    </Pty>
        </Title>
    </Book>
    </Catalog>";
var doc = XDocument.Parse(xml);

var results = new List<object>();
foreach (var book in doc.Descendants("Book")) {
    var title = book.Element("Title");
    string buyerCompany = null;
    string buyerBroker = null;
    string sellerCompany = null;
    string sellerBroker = null;
    if (title.Attribute("Side").Value == "1") {
        sellerCompany = title.Elements("Pty")
            .Where(pty => pty.Attribute("R").Value == "2")
            .Select(pty => pty.Attribute("ID").Value)
            .FirstOrDefault();
        sellerBroker = title.Elements("Pty")
            .Where(pty => pty.Attribute("R").Value == "1")
            .Select(pty => pty.Element("Sub").Attribute("ID").Value)
            .FirstOrDefault();
    } else if (title.Attribute("Side").Value == "2") {
        buyerCompany = title.Elements("Pty")
            .Where(pty => pty.Attribute("R").Value == "2")
            .Select(pty => pty.Attribute("ID").Value)
            .FirstOrDefault();
        buyerBroker = title.Elements("Pty")
            .Where(pty => pty.Attribute("R").Value == "1")
            .Select(pty => pty.Element("Sub").Attribute("ID").Value)
            .FirstOrDefault();
    }

    var result = new {
        ISBN = book.Attribute("ISBN").Value,
        Genre = book.Attribute("Genre").Value,
        Seller_Company = sellerCompany,
        Seller_Broker = sellerBroker,
        Buyer_Company = buyerCompany,
        Buyer_Broker = buyerBroker,
    };

    results.Add(result);
}

Upvotes: 1

jdweng

Reputation: 34421

Try this for complete parsing

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
        static void Main(string[] args)
        {
            XDocument doc = XDocument.Load(FILENAME);
            var result = doc.Descendants("Book")
                .Select(b => new
                {
                    ISBN = b.Attribute("ISBN").Value,
                    Genre = b.Attribute("Genre").Value,
                    Side = b.Element("Title").Attribute("Side").Value,
                    ptr = b.Element("Title").Elements("Pty").Select(x => new {
                        R = x.Attribute("R").Value,
                        PtyID = x.Attribute("ID").Value,
                        // Elements("Sub") never yields null items; FirstOrDefault() returns null when a Pty has no Sub
                        Typ = x.Elements("Sub").Select(y => y.Attribute("Typ").Value).FirstOrDefault(),
                        SubIDTyp = x.Elements("Sub").Select(y => y.Attribute("ID").Value).FirstOrDefault()
                    }).ToList()
                })
                .ToList();
        }
    }
}

Upvotes: 0

Adam Prescott

Reputation: 945

I think maybe you want to group by ISBN and then selectively get values from the children.

const string xml = 
    @"<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>    
    <Catalog>
        <Book ISBN=""1.1.1.1"" Genre=""Thriller"">
            <Title  Side=""1"" MMY=""000"">
                <Pty R=""1"" ID=""Seller_ID"">
                    <Sub Typ=""26"" ID=""Seller_Broker"">
                    </Sub>
                </Pty>
                <Pty R=""2"" ID=""Seller_Company"">
                </Pty>
            </Title>
        </Book>
        <Book ISBN=""1.1.1.1"" Genre=""Thriller"">
            <Title  Side=""2"">
                <Pty R=""1"" ID=""Buyer_ID"">
                    <Sub Typ=""26"" ID=""Buyer_Broker"">
                    </Sub>
                </Pty>
                <Pty R=""2"" ID=""Buyer_Company"">
                </Pty>
            </Title>
        </Book>
    </Catalog>";
var doc = XDocument.Parse(xml);
var results = doc.Descendants("Book")
    .GroupBy(x => x.Attribute("ISBN").Value)
    .Select(x => new {
        ISBN = x.Key,
        Genre = x.First().Attribute("Genre").Value,
        PublishDate = x.First().Element("Title").Attribute("MMY").Value,
        BuyerId = x.Where(book => book.Element("Title").Attribute("Side").Value == "2")
            .First()
            .Element("Title")
            .Element("Pty")
            .Attribute("ID").Value
    })
    .ToArray();

Result:

{
    ISBN = "1.1.1.1",
    Genre = "Thriller",
    PublishDate = "000",
    BuyerId = "Buyer_ID"
}
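The grouped query above only extracts BuyerId. A sketch of how the same grouping might be extended to all four buyer/seller fields, assuming Side=1 is the seller and Side=2 the buyer as stated in the question; TitleForSide, Company and Broker are hypothetical helper names introduced for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Same shape as the XML in this answer: two Book elements sharing one ISBN.
var doc = XDocument.Parse(@"<Catalog>
  <Book ISBN=""1.1.1.1"" Genre=""Thriller"">
    <Title Side=""1"" MMY=""000"">
      <Pty R=""1"" ID=""Seller_ID""><Sub Typ=""26"" ID=""Seller_Broker"" /></Pty>
      <Pty R=""2"" ID=""Seller_Company"" />
    </Title>
  </Book>
  <Book ISBN=""1.1.1.1"" Genre=""Thriller"">
    <Title Side=""2"">
      <Pty R=""1"" ID=""Buyer_ID""><Sub Typ=""26"" ID=""Buyer_Broker"" /></Pty>
      <Pty R=""2"" ID=""Buyer_Company"" />
    </Title>
  </Book>
</Catalog>");

// Hypothetical helper: first Title with the given Side among a group of books, or null.
static XElement TitleForSide(IEnumerable<XElement> books, string side) =>
    books.Elements("Title").FirstOrDefault(t => (string)t.Attribute("Side") == side);

// Hypothetical helper: ID of the Pty with R=2, or null if the title or Pty is missing.
static string Company(XElement title) =>
    (string)title?.Elements("Pty")
        .FirstOrDefault(p => (string)p.Attribute("R") == "2")
        ?.Attribute("ID");

// Hypothetical helper: ID of the Sub with Typ=26 under the Pty with R=1, or null.
static string Broker(XElement title) =>
    (string)title?.Elements("Pty")
        .Where(p => (string)p.Attribute("R") == "1")
        .Elements("Sub")
        .FirstOrDefault(s => (string)s.Attribute("Typ") == "26")
        ?.Attribute("ID");

var results = doc.Descendants("Book")
    .GroupBy(b => (string)b.Attribute("ISBN"))
    .Select(g =>
    {
        var sell = TitleForSide(g, "1");
        var buy = TitleForSide(g, "2");
        return new
        {
            ISBN = g.Key,
            Genre = (string)g.First().Attribute("Genre"),
            Seller_Company = Company(sell),
            Seller_Broker = Broker(sell),
            Buyer_Company = Company(buy),
            Buyer_Broker = Broker(buy),
        };
    })
    .ToArray();

foreach (var r in results)
    Console.WriteLine(r);
```

Using FirstOrDefault with null-conditional access means an ISBN that has only one side yields null for the other side's fields instead of throwing.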

Upvotes: 0

Selman Genç

Reputation: 101681

You can use the Parent property to get the parent element of Pty (the Title element), then read its Side attribute and check it:

.Where(e => (int)e.Attribute("R") == 7 && 
            (int)e.Parent.Attribute("Side") == 2)
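
A sketch of how this Parent check could be folded into the projection from the question, assuming the sample XML from the question and the Side=1 seller / Side=2 buyer mapping stated there; PtyId and BrokerId are hypothetical helper names, and R=2 and Typ=26 are taken from the question's description rather than from the snippet above:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Sample document from the question, with the empty elements shortened.
var doc = XDocument.Parse(@"<Catalog>
  <Book ISBN=""1.1.1.1"" Genre=""Thriller"">
    <Title Side=""1"">
      <Pty R=""1"" ID=""Seller_ID""><Sub Typ=""26"" ID=""John"" /></Pty>
      <Pty R=""2"" ID=""ABC"" />
    </Title>
  </Book>
  <Book ISBN=""1.2.1.1"" Genre=""Thriller"">
    <Title Side=""2"">
      <Pty R=""1"" ID=""Buyer_ID""><Sub Typ=""26"" ID=""Brook"" /></Pty>
      <Pty R=""2"" ID=""XYZ"" />
    </Title>
  </Book>
</Catalog>");

// Hypothetical helper: ID of the Pty with the given R whose parent Title has the given Side.
static string PtyId(XElement book, int side, int r) =>
    (string)book.Descendants("Pty")
        .Where(e => (int)e.Attribute("R") == r &&
                    (int)e.Parent.Attribute("Side") == side)
        .Select(e => e.Attribute("ID"))
        .SingleOrDefault();

// Hypothetical helper: ID of the Sub with the given Typ, under Pty R=1, under the given Side.
static string BrokerId(XElement book, int side, int typ) =>
    (string)book.Descendants("Sub")
        .Where(s => (int)s.Attribute("Typ") == typ &&
                    (int)s.Parent.Attribute("R") == 1 &&
                    (int)s.Parent.Parent.Attribute("Side") == side)
        .Select(s => s.Attribute("ID"))
        .SingleOrDefault();

var result = doc.Descendants("Book")
    .Select(b => new
    {
        ISBN = (string)b.Attribute("ISBN"),
        Genre = (string)b.Attribute("Genre"),
        Seller_Company = PtyId(b, 1, 2),
        Seller_Broker = BrokerId(b, 1, 26),
        Buyer_Company = PtyId(b, 2, 2),
        Buyer_Broker = BrokerId(b, 2, 26),
    })
    .ToArray();

foreach (var r in result)
    Console.WriteLine(r);
```

SingleOrDefault is used instead of Single so that a book without a matching side produces null rather than an exception. It still throws if more than one element matches.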

Upvotes: 2
