user3788671

Reputation: 2047

Parse XML document (with unusual formatting) C#

What is the best way to parse this unusual XML document?

Portion of the xml:

<?xml version="1.0" encoding="UTF-8"?>
<dataset xmlns="http://developer.cognos.com/schemas/xmldata/1/"
 xmlns:xs="http://www.w3.org/2001/XMLSchema-instance">
    <metadata>
        <item name="AsOfDate" type="xs:string" length="12"/>
        <item name="RateOfReturn" type="xs:double"/>
        <item name="FamAcctIndex" type="xs:string" length="3"/>
        <item name="RowID" type="xs:string" length="1"/>
        <item name="BrM" type="xs:string" length="1"/>
        <item name="ProductLineCode" type="xs:int"/>
    </metadata>
    <data>
        <row>
            <value>Apr 26, 2002</value>
            <value>0.210066429</value>
            <value>JA1</value>
            <value>F</value>
            <value>B</value>
            <value>1</value>
        </row>
        <row>
            <value>Apr 27, 2002</value>
            <value>0.1111111</value>
            <value>BBB</value>
            <value>G</value>
            <value>B</value>
            <value>2</value>
        </row>      
    </data>
</dataset>

By "unusual" I mean that I have never had to parse a document laid out as metadata plus data/rows, where each value is identified only by its position in the row. This is what I would usually see:

<person gender="female">
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>

I was going to use:

var xmlDoc = new XmlDocument();
xmlDoc.Load(stream);
//parse here

But I would like to know the best way to do this before getting started, because it is a very large document.

EDITED:

Is this the best way to do this?

var xml = XElement.Load(@"C:\Users\nunya\Desktop\example.xml").Element(XName.Get("data", "http://developer.cognos.com/schemas/xmldata/1/"));
var row = XName.Get("row", "http://developer.cognos.com/schemas/xmldata/1/");
var value = XName.Get("value", "http://developer.cognos.com/schemas/xmldata/1/");


if (xml != null)
{
    foreach (var rowElement in xml.Elements(row))
    {
        foreach (var valueElement in rowElement.Elements(value))
        {
            //valueElement.Value is what i need
        }
    }
}

Thank you!

Upvotes: 0

Views: 127

Answers (1)

Dan Field

Reputation: 21641

You could deserialize the document into a C# class, assuming you have a schema or can generate a reliable one, but the resulting object would still be awkward to work with. I would instead create a class whose properties match the column names in the metadata. You could try to implement IXmlSerializable on a parent of that class, but I think it is more straightforward to write a method that returns a sequence of objects using XDocument.

The basic problem is lining up each column's position in the metadata with the position of the corresponding value in each row. I did it using a dictionary (column name to position) and a list (the values of the current row):

using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Reflection;
using System.Xml.Linq;

public class Product
{
    public string AsOfDate { get; set; }
    public double RateOfReturn { get; set; }
    public string FamAcctIndex { get; set; }
    public string RowID { get; set; }
    public string BrM { get; set; }
    public int ProductLineCode { get; set; }
}


public static IEnumerable<Product> ParseDataset(XDocument xd)
{
    XNamespace ns = "http://developer.cognos.com/schemas/xmldata/1/";

    // parse out the column names, mapping each name to its position
    Dictionary<string, int> headerPositions = xd.Root
        .Element(ns + "metadata")
        .Elements()
        .Select((name, idx) => new { pos = idx, name = (string)name.Attribute("name") })
        .ToDictionary(x => x.name, x => x.pos);

    foreach (XElement row in xd.Root.Descendants(ns + "row"))
    {
        // the values of this row, in column order
        List<string> vals = row.Elements().Select(x => x.Value).ToList();
        Product obj = new Product();
        foreach (PropertyInfo prop in typeof(Product).GetProperties())
        {
            string valToSet = vals[headerPositions[prop.Name]];
            // invariant culture so "0.21" parses the same regardless of machine locale
            prop.SetValue(obj, Convert.ChangeType(valToSet, prop.PropertyType, CultureInfo.InvariantCulture));
        }
        yield return obj;
    }
}

If performance is a concern, you might want to avoid reflection and use if/switch on the property names instead. You could call the function like this:

XDocument xd = XDocument.Load(...);
List<Product> products = ParseDataset(xd).ToList();
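
The if/switch alternative mentioned above could look roughly like the following. This is only a sketch under a few assumptions: the class name DatasetParser and the method name ParseDatasetNoReflection are made up for illustration, numeric values are assumed to use invariant-culture formatting (as in the sample), and unknown columns are silently ignored.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

public class Product
{
    public string AsOfDate { get; set; }
    public double RateOfReturn { get; set; }
    public string FamAcctIndex { get; set; }
    public string RowID { get; set; }
    public string BrM { get; set; }
    public int ProductLineCode { get; set; }
}

// Hypothetical helper class; the name is not from the original answer.
public static class DatasetParser
{
    private static readonly XNamespace Ns = "http://developer.cognos.com/schemas/xmldata/1/";

    public static IEnumerable<Product> ParseDatasetNoReflection(XDocument xd)
    {
        // Column names in document order: the index of a name is the
        // position of its <value> inside every <row>.
        List<string> columns = xd.Root
            .Element(Ns + "metadata")
            .Elements(Ns + "item")
            .Select(item => (string)item.Attribute("name"))
            .ToList();

        foreach (XElement row in xd.Root.Descendants(Ns + "row"))
        {
            List<string> vals = row.Elements(Ns + "value").Select(v => v.Value).ToList();
            var product = new Product();

            // Walk columns and values in parallel; stop at the shorter of the two.
            for (int i = 0; i < columns.Count && i < vals.Count; i++)
            {
                switch (columns[i])
                {
                    case "AsOfDate": product.AsOfDate = vals[i]; break;
                    case "RateOfReturn":
                        product.RateOfReturn = double.Parse(vals[i], CultureInfo.InvariantCulture);
                        break;
                    case "FamAcctIndex": product.FamAcctIndex = vals[i]; break;
                    case "RowID": product.RowID = vals[i]; break;
                    case "BrM": product.BrM = vals[i]; break;
                    case "ProductLineCode":
                        product.ProductLineCode = int.Parse(vals[i], CultureInfo.InvariantCulture);
                        break;
                    // Assumption: columns without a matching property are skipped.
                }
            }
            yield return product;
        }
    }
}
```

It is called the same way as the reflection version, e.g. DatasetParser.ParseDatasetNoReflection(xd).ToList(). The trade-off is that every new column needs a new case, whereas the reflection version picks up new properties automatically.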

Upvotes: 1
