Reputation: 3428
I am trying to retrieve data from an XML document into an array in C# using LINQ. Some of the values require nested querying within the elements of the XML data, which is as follows:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Catalog>
<Book ISBN="1.1.1.1" Genre="Thriller">
<Title Side="1">
<Pty R="1" ID="Seller_ID">
<Sub Typ="26" ID="John">
</Sub>
</Pty>
<Pty R="2" ID="ABC">
</Pty>
</Title>
</Book>
<Book ISBN="1.2.1.1" Genre="Thriller">
<Title Side="2">
<Pty R="1" ID="Buyer_ID">
<Sub Typ="26" ID="Brook">
</Sub>
</Pty>
<Pty R="2" ID="XYZ">
</Pty>
</Title>
</Book>
</Catalog>
In the above XML document, Side="1" represents the sell side and Side="2" represents the buy side.
Now I want to store the above elements and attributes in an array which has the following fields:
Array ISBN Genre PublishDate Buyer_Company Seller_Company Buyer_Broker Seller_Broker
I was able to retrieve the plain elements and attributes, but I am not sure how to handle fields that depend on other elements, namely
Buyer_Company
Seller_Company
Buyer_Broker
Seller_Broker
which are derived from the Side, Pty and Sub elements. For example, Buyer_Company is the ID attribute of the Pty element where R=2 and Side=2. Similarly, Buyer_Broker is the ID attribute of the Sub element whose Typ attribute is 26 (the XML data can contain other values of Typ), where that Sub element is a child of the Pty element with R=1, which in turn belongs to a Book element with Side=2.
The code I used to retrieve the independent elements is
var result = doc.Descendants("Book")
    .Select(b => new
    {
        ISBN = b.Attribute("ISBN").Value,
        Genre = b.Attribute("Genre").Value,
        PublishDate = b.Element("Title").Attribute("MMY").Value,
    })
    .ToArray();
I also tried querying within a single element as follows
Company = (string)b.Descendants("Pty")
    .Where(e => (int)e.Attribute("R") == 7)
    .Select(e => e.Attribute("ID"))
    .Single()
But this does not take the Side attribute of the Book element into account.
Sample Data
First Book Element
ISBN:1.1.1.1
Genre:Thriller
Seller_Company:NULL
Seller_Broker:NULL
Buyer_Company:ABC
Buyer_Broker:John
Second Book Element
ISBN:1.1.1.1
Genre:Thriller
Seller_Company:XYZ
Seller_Broker:Brook
Buyer_Company: NULL
Buyer_Broker:NULL
Side=1 represents the seller side and Side=2 represents the buyer side, which is why the seller side is null in the first element of the resulting array and the buyer side is null in the second element.
May I know a better way to solve this?
Upvotes: 0
Views: 1295
Reputation: 2818
Edited to match the question:
Using XPath:
private static string GetCompanyValue(XElement bookElement, string side, string r)
{
    string format = "Title[@Side={0}]/Pty[@R={1}]";
    return GetValueByXPath(bookElement, string.Format(format, side, r));
}

private static string GetBrokerValue(XElement bookElement, string side)
{
    string format = "Title[@Side={0}]/Pty[@R=1]/Sub[@Typ=26]";
    return GetValueByXPath(bookElement, string.Format(format, side));
}

private static string GetValueByXPath(XElement bookElement, string expression)
{
    // Returns null when no element matches the expression.
    XElement element = bookElement.XPathSelectElement(expression);
    return element != null ? element.Attribute("ID").Value : null;
}
The calling code looks like this:
var result = doc.Descendants("Book")
    .Select(book => new
    {
        ISBN = book.Attribute("ISBN").Value,
        Genre = book.Attribute("Genre").Value,
        Buyer_Company = GetCompanyValue(book, "2", "2"),
        Buyer_Broker = GetBrokerValue(book, "2"),
        Seller_Company = GetCompanyValue(book, "1", "2"),
        Seller_Broker = GetBrokerValue(book, "1")
    })
    .ToArray();
Add a using directive for the XPath extension methods: using System.Xml.XPath;
Upvotes: 2
Reputation: 945
Now that you've provided some examples, I think this will work for you.
const string xml =
@"<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>
<Catalog>
<Book ISBN=""1.1.1.1"" Genre=""Thriller"">
<Title Side=""1"">
<Pty R=""1"" ID=""Seller_ID"">
<Sub Typ=""26"" ID=""John"">
</Sub>
</Pty>
<Pty R=""2"" ID=""ABC"">
</Pty>
</Title>
</Book>
<Book ISBN=""1.2.1.1"" Genre=""Thriller"">
<Title Side=""2"">
<Pty R=""1"" ID=""Buyer_ID"">
<Sub Typ=""26"" ID=""Brook"">
</Sub>
</Pty>
<Pty R=""2"" ID=""XYZ"">
</Pty>
</Title>
</Book>
</Catalog>";
var doc = XDocument.Parse(xml);
var results = new List<object>();
foreach (var book in doc.Descendants("Book")) {
    var title = book.Element("Title");
    string buyerCompany = null;
    string buyerBroker = null;
    string sellerCompany = null;
    string sellerBroker = null;
    if (title.Attribute("Side").Value == "1") {
        // Side 1: fill the seller fields only.
        sellerCompany = title.Elements("Pty")
            .Where(pty => pty.Attribute("R").Value == "2")
            .Select(pty => pty.Attribute("ID").Value)
            .FirstOrDefault();
        sellerBroker = title.Elements("Pty")
            .Where(pty => pty.Attribute("R").Value == "1")
            .Select(pty => pty.Element("Sub").Attribute("ID").Value)
            .FirstOrDefault();
    } else if (title.Attribute("Side").Value == "2") {
        // Side 2: fill the buyer fields only.
        buyerCompany = title.Elements("Pty")
            .Where(pty => pty.Attribute("R").Value == "2")
            .Select(pty => pty.Attribute("ID").Value)
            .FirstOrDefault();
        buyerBroker = title.Elements("Pty")
            .Where(pty => pty.Attribute("R").Value == "1")
            .Select(pty => pty.Element("Sub").Attribute("ID").Value)
            .FirstOrDefault();
    }
    var result = new {
        ISBN = book.Attribute("ISBN").Value,
        Genre = book.Attribute("Genre").Value,
        Seller_Company = sellerCompany,
        Seller_Broker = sellerBroker,
        Buyer_Company = buyerCompany,
        Buyer_Broker = buyerBroker,
    };
    results.Add(result);
}
Upvotes: 1
Reputation: 34421
Try this for complete parsing
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
        static void Main(string[] args)
        {
            XDocument doc = XDocument.Load(FILENAME);
            var result = doc.Descendants("Book")
                .Select(b => new
                {
                    ISBN = b.Attribute("ISBN").Value,
                    Genre = b.Attribute("Genre").Value,
                    Side = b.Element("Title").Attribute("Side").Value,
                    ptr = b.Element("Title").Elements("Pty").Select(x => new {
                        R = x.Attribute("R").Value,
                        PtyID = x.Attribute("ID").Value,
                        // The string cast yields null when there is no Sub child or the attribute is missing.
                        Typ = (string)x.Elements("Sub").Select(y => y.Attribute("Typ")).FirstOrDefault(),
                        SubIDTyp = (string)x.Elements("Sub").Select(y => y.Attribute("ID")).FirstOrDefault()
                    }).ToList()
                })
                .ToList();
        }
    }
}
Upvotes: 0
Reputation: 945
I think maybe you want to group by ISBN and then selectively get values from the children.
const string xml =
@"<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>
<Catalog>
<Book ISBN=""1.1.1.1"" Genre=""Thriller"">
<Title Side=""1"" MMY=""000"">
<Pty R=""1"" ID=""Seller_ID"">
<Sub Typ=""26"" ID=""Seller_Broker"">
</Sub>
</Pty>
<Pty R=""2"" ID=""Seller_Company"">
</Pty>
</Title>
</Book>
<Book ISBN=""1.1.1.1"" Genre=""Thriller"">
<Title Side=""2"">
<Pty R=""1"" ID=""Buyer_ID"">
<Sub Typ=""26"" ID=""Buyer_Broker"">
</Sub>
</Pty>
<Pty R=""2"" ID=""Buyer_Company"">
</Pty>
</Title>
</Book>
</Catalog>";
var doc = XDocument.Parse(xml);
var results = doc.Descendants("Book")
    .GroupBy(x => x.Attribute("ISBN").Value)
    .Select(x => new {
        ISBN = x.Key,
        Genre = x.First().Attribute("Genre").Value,
        PublishDate = x.First().Element("Title").Attribute("MMY").Value,
        BuyerId = x.Where(book => book.Element("Title").Attribute("Side").Value == "2")
            .First()
            .Element("Title")
            .Element("Pty")
            .Attribute("ID").Value
    })
    .ToArray();
Result:
{
    ISBN = "1.1.1.1",
    Genre = "Thriller",
    PublishDate = "000",
    BuyerId = "Buyer_ID"
}
Upvotes: 0
Reputation: 101681
You can use the Parent property to get the parent element of Pty, then read its Side attribute and check it:
.Where(e => (int)e.Attribute("R") == 7 &&
            (int)e.Parent.Attribute("Side") == 2)
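For context, here is a minimal sketch of how that Parent check could slot into the full projection, run against a trimmed copy of the question's XML. The `companyFor` helper and the `rows` variable are hypothetical names introduced only for this illustration, and R=2 is assumed to mark the company Pty, as described in the question (the snippet above uses 7). The `int?` and `string` casts are used so that a missing attribute gives null instead of throwing.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Trimmed copy of the question's sample XML.
var doc = XDocument.Parse(@"<Catalog>
  <Book ISBN=""1.1.1.1"" Genre=""Thriller"">
    <Title Side=""1"">
      <Pty R=""1"" ID=""Seller_ID""><Sub Typ=""26"" ID=""John"" /></Pty>
      <Pty R=""2"" ID=""ABC"" />
    </Title>
  </Book>
  <Book ISBN=""1.2.1.1"" Genre=""Thriller"">
    <Title Side=""2"">
      <Pty R=""1"" ID=""Buyer_ID""><Sub Typ=""26"" ID=""Brook"" /></Pty>
      <Pty R=""2"" ID=""XYZ"" />
    </Title>
  </Book>
</Catalog>");

// Hypothetical helper: ID of the Pty with the given R whose parent Title
// has the given Side, or null when the book has no such Pty.
Func<XElement, int, int, string> companyFor = (book, side, r) =>
    (string)book.Descendants("Pty")
        .Where(e => (int?)e.Attribute("R") == r
                 && (int?)e.Parent.Attribute("Side") == side)
        .Select(e => e.Attribute("ID"))
        .SingleOrDefault();

var rows = doc.Descendants("Book")
    .Select(b => new
    {
        ISBN = (string)b.Attribute("ISBN"),
        Seller_Company = companyFor(b, 1, 2), // Side=1 is the sell side
        Buyer_Company = companyFor(b, 2, 2),  // Side=2 is the buy side
    })
    .ToArray();

foreach (var row in rows)
    Console.WriteLine("{0}: seller={1}, buyer={2}",
        row.ISBN, row.Seller_Company ?? "NULL", row.Buyer_Company ?? "NULL");
```

SingleOrDefault is used here instead of Single so that a Book with no matching side produces null, which seems to be what the question expects for the opposite side's fields.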
Upvotes: 2