Reputation: 19097

How can I parse this XML with XmlSerializer attributes?

We need to use a "web service" which communicates extremely ugly XML. It is not developed by us and there is zero chance to make its developers understand proper XML.

FYI, this web service also accepts the same kind of XML in an HTTP GET URL parameter (not the request body), and the guys who developed it don't understand why that is bad practice.

So, what is the fastest way to map an XML such as this:

<foo id="document">
    <foo id="customer">
        <bar name="firstname" value="Joe"/>
        <bar name="lastname" value="Smith"/>
        <foo id="address">
            <bar name="city" value="New York"/>
            <bar name="country" value="USA"/>
        </foo>
    </foo>
    <bar name="somemoredata1" value="123"/>
    <bar name="somemoredata2" value="abc"/>
</foo>

into classes like this:

public class Document
{
    public Customer Customer { get; set; }

    public int SomeMoreData1 { get; set; }

    public string SomeMoreData2 { get; set; }
}

public class Customer
{
    public Address Address { get; set; }

    public string FirstName { get; set; }

    public string LastName { get; set; }
}


public class Address
{
    public string City { get; set; }

    public string Country { get; set; }
}

using e.g. XmlSerializer attributes, or any other way that needs as little boilerplate code as possible?

I made up the foo and bar element names, but the structure of the XML I need to parse is based on the exact same convention.

I could of course implement IXmlSerializable manually in these classes, or just make Foo and Bar classes and use those with the XmlSerializer, but neither of these options seems to be a good solution.

Upvotes: 0

Answers (4)

Anton Tykhyy

Reputation: 20076

You can't do it with XML serializer attributes: there is just no way to make it take a field name out of a specified attribute. You will have to deserialize manually (possibly generating the boilerplate) or pre-process the XML — a simple XSLT along the following lines will do the trick:

<xsl:template match="foo">
  <xsl:element name="{@id}">
    <xsl:apply-templates/>
  </xsl:element>
</xsl:template>

<xsl:template match="bar">
  <xsl:element name="{@name}">
    <xsl:value-of select="@value"/>
  </xsl:element>
</xsl:template>
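
For what it's worth, here is a minimal sketch of how the pre-processing step could be wired up in C#, assuming the two templates above are wrapped in an ordinary xsl:stylesheet element and that the classes from the question get lower-case XmlElement names to match the transformed output. The XsltLoader class and its Load method are hypothetical names used only for illustration, and the foo/bar names are the made-up ones from the question:

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;
using System.Xml.Xsl;

// The question's classes, annotated so that they match the element names
// produced by the transformation (document, customer, firstname, ...).
[XmlRoot("document")]
public class Document
{
    [XmlElement("customer")] public Customer Customer { get; set; }
    [XmlElement("somemoredata1")] public int SomeMoreData1 { get; set; }
    [XmlElement("somemoredata2")] public string SomeMoreData2 { get; set; }
}

public class Customer
{
    [XmlElement("firstname")] public string FirstName { get; set; }
    [XmlElement("lastname")] public string LastName { get; set; }
    [XmlElement("address")] public Address Address { get; set; }
}

public class Address
{
    [XmlElement("city")] public string City { get; set; }
    [XmlElement("country")] public string Country { get; set; }
}

// Hypothetical helper: transform first, then hand the result to XmlSerializer.
public static class XsltLoader
{
    // Assumption: the two templates from the answer inside a standard stylesheet wrapper.
    private const string Xslt = @"<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:template match=""foo"">
    <xsl:element name=""{@id}""><xsl:apply-templates/></xsl:element>
  </xsl:template>
  <xsl:template match=""bar"">
    <xsl:element name=""{@name}""><xsl:value-of select=""@value""/></xsl:element>
  </xsl:template>
</xsl:stylesheet>";

    public static Document Load(string uglyXml)
    {
        // Compile the stylesheet.
        var transform = new XslCompiledTransform();
        using (XmlReader xsltReader = XmlReader.Create(new StringReader(Xslt)))
            transform.Load(xsltReader);

        // Rewrite foo/bar into elements named after their id/name attributes.
        var buffer = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(uglyXml)))
        using (XmlWriter output = XmlWriter.Create(buffer, transform.OutputSettings))
            transform.Transform(input, output);

        // Deserialize the cleaned-up XML as usual.
        var serializer = new XmlSerializer(typeof(Document));
        using (var result = new StringReader(buffer.ToString()))
            return (Document)serializer.Deserialize(result);
    }
}
```

Calling XsltLoader.Load with the sample document from the question should give back a populated Document. In real code you would probably compile the stylesheet once and reuse it rather than reloading it on every call.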

Update: for the reverse transformation:

<xsl:template match="*[count(child::text())=1]">
  <bar value="{text()}" name="{local-name()}"/>
</xsl:template>

<xsl:template match="*">
  <foo id="{local-name()}">
    <xsl:apply-templates/>
  </foo>
</xsl:template>

Upvotes: 1

Jon Hanna

Reputation: 113322

Since you say in a comment that you are plumping for XmlSerializer for simplicity rather than because that approach is enforced by other concerns, here's a different approach. Since it seems that the names of the elements are insignificant in the document, I ignore them in the parsing, though one can test those too. With more pleasant XML formats, the element names would be the main thing that the parsing would key off (generally with a switch on the element names):

// Requires: using System.Xml;
// Element names are ignored; only the id/name/value attributes drive the parsing.
private static Document ParseDocument(XmlReader xr)
{
    Document doc = new Document();
    while (xr.Read())
    {
        if (xr.NodeType != XmlNodeType.Element)
            continue;
        if (xr.GetAttribute("id") == "customer")
        {
            // Close the subtree reader before touching xr again.
            using (XmlReader sub = xr.ReadSubtree())
                doc.Customer = ParseCustomer(sub);
        }
        else
        {
            switch (xr.GetAttribute("name"))
            {
                case "somemoredata1":
                    doc.SomeMoreData1 = int.Parse(xr.GetAttribute("value"));
                    break;
                case "somemoredata2":
                    doc.SomeMoreData2 = xr.GetAttribute("value");
                    break;
            }
        }
    }
    // Put some validation of doc here if necessary.
    return doc;
}

private static Customer ParseCustomer(XmlReader xr)
{
    Customer cu = new Customer();
    while (xr.Read())
    {
        if (xr.NodeType != XmlNodeType.Element)
            continue;
        if (xr.GetAttribute("id") == "address")
        {
            // Close the subtree reader before touching xr again.
            using (XmlReader sub = xr.ReadSubtree())
                cu.Address = ParseAddress(sub);
        }
        else
        {
            switch (xr.GetAttribute("name"))
            {
                case "firstname":
                    cu.FirstName = xr.GetAttribute("value");
                    break;
                case "lastname":
                    cu.LastName = xr.GetAttribute("value");
                    break;
            }
        }
    }
    // Validate here if necessary.
    return cu;
}

private static Address ParseAddress(XmlReader xr)
{
    Address add = new Address();
    while (xr.Read())
    {
        if (xr.NodeType != XmlNodeType.Element)
            continue;
        switch (xr.GetAttribute("name"))
        {
            case "city":
                add.City = xr.GetAttribute("value");
                break;
            case "country":
                add.Country = xr.GetAttribute("value");
                break;
        }
    }
    return add;
}

It's not exactly pretty (it's not terribly pretty with nice XML to work off either, though it tends not to be quite as bad), but it works, and the use of subtrees can be nice with complicated structures where the same type can turn up in different places within the document. One can replace the static methods that set values from the outside with constructors that take the XmlReader, which allows one to enforce class invariants and/or make the objects immutable.
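
As a rough illustration of the constructor variant, here is a sketch for the address type. ImmutableAddress is a hypothetical name (so it doesn't clash with the Address class from the question), the rule that both values must be present is an assumed invariant, and get-only auto-properties need C# 6 or later:

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical immutable counterpart of the question's Address class.
public sealed class ImmutableAddress
{
    public string City { get; }
    public string Country { get; }

    // Reads the <bar name="..." value="..."/> children from the given reader,
    // which is expected to cover just the address subtree.
    public ImmutableAddress(XmlReader xr)
    {
        string city = null, country = null;
        while (xr.Read())
        {
            if (xr.NodeType != XmlNodeType.Element)
                continue;
            switch (xr.GetAttribute("name"))
            {
                case "city":
                    city = xr.GetAttribute("value");
                    break;
                case "country":
                    country = xr.GetAttribute("value");
                    break;
            }
        }

        // Assumed invariant: an address without city or country is rejected.
        if (city == null || country == null)
            throw new FormatException("Address needs both city and country.");

        City = city;
        Country = country;
    }

    // Convenience wrapper for parsing a standalone address fragment.
    public static ImmutableAddress Parse(string xml)
    {
        using (XmlReader xr = XmlReader.Create(new StringReader(xml)))
            return new ImmutableAddress(xr);
    }
}
```

The parent parser would then call new ImmutableAddress(sub) on the subtree reader instead of a static ParseAddress method.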

This approach comes into its own in the case of large documents that you want to deserialise as a large series of the same sort of items (or a large series of just a few types), because one can yield them out as they're created, which can make quite a difference to the delay to first response.
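
A minimal sketch of that streaming idea, assuming a document whose root contains a long run of `<foo id="customer">` elements. CustomerStream and CustomerName are hypothetical names for illustration, and only the two name fields are read to keep it short:

```csharp
using System.Collections.Generic;
using System.Xml;

// Hypothetical cut-down customer type for the sketch.
public sealed class CustomerName
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class CustomerStream
{
    // Yields each customer as soon as its subtree has been read, so the
    // caller can start working before the rest of the document is parsed.
    public static IEnumerable<CustomerName> ReadCustomers(XmlReader xr)
    {
        while (xr.Read())
        {
            if (xr.NodeType != XmlNodeType.Element || xr.GetAttribute("id") != "customer")
                continue;

            CustomerName customer;
            // Close the subtree reader before the outer reader moves on.
            using (XmlReader sub = xr.ReadSubtree())
                customer = ParseCustomer(sub);
            yield return customer;
        }
    }

    private static CustomerName ParseCustomer(XmlReader xr)
    {
        var cu = new CustomerName();
        while (xr.Read())
        {
            if (xr.NodeType != XmlNodeType.Element)
                continue;
            switch (xr.GetAttribute("name"))
            {
                case "firstname":
                    cu.FirstName = xr.GetAttribute("value");
                    break;
                case "lastname":
                    cu.LastName = xr.GetAttribute("value");
                    break;
            }
        }
        return cu;
    }
}
```

Because the method is an iterator, nothing is read until the caller starts enumerating, and the XmlReader has to stay open for as long as the enumeration runs.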

Upvotes: 1

habakuk

Reputation: 2771

You can try the XML Schema Definition Tool, xsd.exe ( http://msdn.microsoft.com/de-de/library/x6c1kb0s%28v=vs.80%29.aspx )

Ciao! Stefan

Upvotes: -1

Justin

Reputation: 6559

I haven't used the XML serializer or deserializer myself, but I do use LINQ to XML to parse my XML documents into objects. If your classes are fairly simple, you may want to look into that route.
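
To give an idea of what that could look like for the XML in the question, here is a sketch using System.Xml.Linq. LinqParser and its Value and Child helpers are hypothetical names, foo/bar are the made-up element names from the question, and the classes are repeated only so the snippet stands alone:

```csharp
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

public class Document
{
    public Customer Customer { get; set; }
    public int SomeMoreData1 { get; set; }
    public string SomeMoreData2 { get; set; }
}

public class Customer
{
    public Address Address { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class Address
{
    public string City { get; set; }
    public string Country { get; set; }
}

public static class LinqParser
{
    // Value attribute of the direct <bar name="..."> child, or null if absent.
    private static string Value(XElement parent, string name) =>
        (string)parent.Elements("bar")
            .FirstOrDefault(b => (string)b.Attribute("name") == name)
            ?.Attribute("value");

    // Direct <foo id="..."> child, or null if absent.
    private static XElement Child(XElement parent, string id) =>
        parent.Elements("foo").FirstOrDefault(f => (string)f.Attribute("id") == id);

    public static Document Parse(string xml)
    {
        XElement root = XElement.Parse(xml);
        XElement customer = Child(root, "customer");
        XElement address = customer == null ? null : Child(customer, "address");

        return new Document
        {
            Customer = customer == null ? null : new Customer
            {
                FirstName = Value(customer, "firstname"),
                LastName = Value(customer, "lastname"),
                Address = address == null ? null : new Address
                {
                    City = Value(address, "city"),
                    Country = Value(address, "country")
                }
            },
            // int.Parse throws if somemoredata1 is missing or not a number.
            SomeMoreData1 = int.Parse(Value(root, "somemoredata1"), CultureInfo.InvariantCulture),
            SomeMoreData2 = Value(root, "somemoredata2")
        };
    }
}
```

This loads the whole document into memory, which is fine for small messages but not a good fit for very large ones.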

Upvotes: 1
