Reputation: 2047
What is the best way to parse this unusual XML document?
Portion of the XML:
<?xml version="1.0" encoding="UTF-8"?>
<dataset xmlns="http://developer.cognos.com/schemas/xmldata/1/"
xmlns:xs="http://www.w3.org/2001/XMLSchema-instance">
<metadata>
<item name="AsOfDate" type="xs:string" length="12"/>
<item name="RateOfReturn" type="xs:double"/>
<item name="FamAcctIndex" type="xs:string" length="3"/>
<item name="RowID" type="xs:string" length="1"/>
<item name="BrM" type="xs:string" length="1"/>
<item name="ProductLineCode" type="xs:int"/>
</metadata>
<data>
<row>
<value>Apr 26, 2002</value>
<value>0.210066429</value>
<value>JA1</value>
<value>F</value>
<value>B</value>
<value>1</value>
</row>
<row>
<value>Apr 27, 2002</value>
<value>0.1111111</value>
<value>BBB</value>
<value>G</value>
<value>B</value>
<value>2</value>
</row>
</data>
</dataset>
When I say unusual XML document, I mean that I have never had to parse something laid out as data/rows, where the column names live in a separate metadata block. This is what I would usually see:
<person gender="female">
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
I was going to use:
var xmlDoc = new XmlDocument();
xmlDoc.Load(stream);
//parse here
But I figured I should find out the best way to do this before getting started, because it is a very large document.
EDITED:
Is this the best way to do this?
var xml = XElement.Load(@"C:\Users\nunya\Desktop\example.xml").Element(XName.Get("data", "http://developer.cognos.com/schemas/xmldata/1/"));
var row = XName.Get("row", "http://developer.cognos.com/schemas/xmldata/1/");
var value = XName.Get("value", "http://developer.cognos.com/schemas/xmldata/1/");
if (xml != null)
{
    foreach (var rowElement in xml.Elements(row))
    {
        foreach (var valueElement in rowElement.Elements(value))
        {
            // valueElement.Value is what I need
        }
    }
}
Thank you!
Upvotes: 0
Views: 127
Reputation: 21641
You could deserialize the document into a C# class, assuming you have a schema or can generate a reliable one, but the result would still be awkward to work with, because each row holds positional values rather than named elements. I would create a class whose properties match the metadata item names. You could try implementing IXmlSerializable on a parent of that class, but I think it would be more straightforward to write a method that returns a list using XDocument.
The basic problem is figuring out how to line up the column indexes with the row value indexes. I did it using a dictionary and a list:
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Reflection;
using System.Xml.Linq;

public class Product
{
    public string AsOfDate { get; set; }
    public double RateOfReturn { get; set; }
    public string FamAcctIndex { get; set; }
    public string RowID { get; set; }
    public string BrM { get; set; }
    public int ProductLineCode { get; set; }
}

public static IEnumerable<Product> ParseDataset(XDocument xd)
{
    XNamespace ns = "http://developer.cognos.com/schemas/xmldata/1/";

    // parse out the column names, keyed by name, with their position as the value
    Dictionary<string, int> headerPositions = xd.Root
        .Element(ns + "metadata")
        .Elements()
        .Select((name, idx) => new { pos = idx, name = (string)name.Attribute("name") })
        .ToDictionary(x => x.name, x => x.pos);

    foreach (XElement row in xd.Root.Descendants(ns + "row"))
    {
        // the values of this row, in column order
        List<string> vals = row.Elements().Select(x => x.Value).ToList();
        Product obj = new Product();
        foreach (PropertyInfo prop in typeof(Product).GetProperties())
        {
            // look up the column whose name matches the property name
            string valToSet = vals[headerPositions[prop.Name]];
            // invariant culture so "0.210066429" parses the same on any machine
            prop.SetValue(obj, Convert.ChangeType(valToSet, prop.PropertyType, CultureInfo.InvariantCulture));
        }
        yield return obj;
    }
}
If performance is a concern, you might want to avoid reflection and just use if/switch on the property names, or assign each property explicitly. You could call the function like this:
XDocument xd = XDocument.Load(...);
List<Product> products = ParseDataset(xd).ToList();
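For the reflection-free route mentioned above, here is a minimal sketch. It assumes the metadata always contains the six column names from the sample; DatasetParser and ParseDatasetExplicit are hypothetical names I made up for illustration, and the Product class is repeated only so the snippet compiles on its own. The column positions are looked up once, outside the row loop, and each property is assigned explicitly.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

// Same Product class as above, repeated so the sketch is self-contained.
public class Product
{
    public string AsOfDate { get; set; }
    public double RateOfReturn { get; set; }
    public string FamAcctIndex { get; set; }
    public string RowID { get; set; }
    public string BrM { get; set; }
    public int ProductLineCode { get; set; }
}

// Hypothetical helper class; the name is an assumption, not part of any library.
public static class DatasetParser
{
    private static readonly XNamespace Ns = "http://developer.cognos.com/schemas/xmldata/1/";

    public static IEnumerable<Product> ParseDatasetExplicit(XDocument xd)
    {
        // Map each column name to its position in the metadata block.
        Dictionary<string, int> pos = xd.Root
            .Element(Ns + "metadata")
            .Elements(Ns + "item")
            .Select((item, idx) => new { Name = (string)item.Attribute("name"), Index = idx })
            .ToDictionary(x => x.Name, x => x.Index);

        // Resolve the indexes once so the row loop does no dictionary lookups.
        int asOfDate = pos["AsOfDate"];
        int rateOfReturn = pos["RateOfReturn"];
        int famAcctIndex = pos["FamAcctIndex"];
        int rowId = pos["RowID"];
        int brM = pos["BrM"];
        int productLineCode = pos["ProductLineCode"];

        foreach (XElement row in xd.Root.Element(Ns + "data").Elements(Ns + "row"))
        {
            // The values of this row, in column order.
            List<string> vals = row.Elements(Ns + "value").Select(v => v.Value).ToList();

            yield return new Product
            {
                AsOfDate = vals[asOfDate],
                // Invariant culture so "." is always the decimal separator.
                RateOfReturn = double.Parse(vals[rateOfReturn], CultureInfo.InvariantCulture),
                FamAcctIndex = vals[famAcctIndex],
                RowID = vals[rowId],
                BrM = vals[brM],
                ProductLineCode = int.Parse(vals[productLineCode], CultureInfo.InvariantCulture)
            };
        }
    }
}
```

It is called the same way as the reflection version, e.g. DatasetParser.ParseDatasetExplicit(xd).ToList(). The trade-off is that adding a column means touching this method as well as the class.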
Upvotes: 1