XML Deserialization of collection property with code defaults

Question

For application configuration, I frequently create a configuration class holding the application's configuration values, which I then deserialize into an object to use. The configuration object is usually data-bound to a user interface control so that the user can change and persist the configuration. The class typically assigns default values to its properties so that there is always a default configuration. This has worked well. Recently, though, I had a list of strings that provided some default path information, and what I saw made me realize I did not completely understand how object properties are populated during XML deserialization.

So I created a simple example to show the behavior. The following is a simple class that has a couple of properties that have some code defaults.

[Serializable]
public class TestConfiguration
{
   public String Name
   {
      get
      {
         return mName;
      }
      set
      {
         mName = value;
      }
   }
   private String mName = "Pete Sebeck";

   public List<String> Associates
   {
      get
      {
         return mAssociates;
      }
      set
      {
         mAssociates = value;
      }
   }
   private List<String> mAssociates = new List<String>() { "Jon", "Natalie" };

   public override String ToString()
   {
      StringBuilder buffer = new StringBuilder();
      buffer.AppendLine(String.Format("Name: {0}", Name));
      buffer.AppendLine("Associates:");
      foreach (String associate in mAssociates)
      {
         buffer.AppendLine(String.Format("\t{0}", associate));
      }
      return buffer.ToString();
   }
}

And here is a Main that creates a new object, prints its state to the console, serializes it (as XML) to a file, then reconstitutes an object from that file and again prints the state to the console. What I expected was an object that matched what was serialized. What I got was the default object with the contents of the serialized list appended to the defaults.

  static void Main(string[] args)
  {
     // Create a default object
     TestConfiguration configuration = new TestConfiguration();
     Console.WriteLine(configuration.ToString());

     // Serialize the object
     XmlSerializer writer = new XmlSerializer(typeof(TestConfiguration));
     StreamWriter filewriter = new StreamWriter("TestConfiguration.xml");
     writer.Serialize(filewriter, configuration);
     filewriter.Close();

     // Now deserialize the xml into another object
     XmlSerializer reader = new XmlSerializer(typeof(TestConfiguration));
     StreamReader filereader = new StreamReader("TestConfiguration.xml");
     TestConfiguration deserializedconfiguration = (TestConfiguration)reader.Deserialize(filereader);
     filereader.Close();

     Console.WriteLine(deserializedconfiguration.ToString());

     Console.ReadLine();
  }

Results:

Name: Pete Sebeck
Associates:
        Jon
        Natalie

Name: Pete Sebeck
Associates:
        Jon
        Natalie
        Jon
        Natalie

I guess I always thought the List property would be set rather than appended to. Does anyone have a pointer to documentation of the deserialization process for collections? I apparently do not know the correct search terms, as my attempts are coming up empty. I see other posts describing what I am seeing, along with their approach of implementing serialization themselves. I am looking more for a pointer that describes what happens when a collection is deserialized, so I can explain to myself what I am seeing.

dbc · Accepted Answer

You are correct that many serializers (though not all) work this way. Json.NET does; its JsonConverter.ReadJson method actually has an Object existingValue parameter for exactly this situation.

I don't know of any documents where these sorts of implementation details are spelled out. The easiest way to determine whether a serializer uses pre-allocated collections when present rather than unconditionally allocating and then setting one itself is to actually test it by using an ObservableCollection and attaching debug listeners when it is changed:

[Serializable]
[DataContract]
public class TestConfiguration
{
    [DataMember]
    public String Name { get { return mName; } set { mName = value; } }

    private String mName = "Pete Sebeck";

    [DataMember]
    public ObservableCollection<string> Associates
    {
        get
        {
            Debug.WriteLine(mAssociates == null ? "Associates gotten, null value" : "Associates gotten, count = " + mAssociates.Count.ToString());
            return mAssociates;
        }
        set
        {
            Debug.WriteLine(value == null ? "Associates set to a null value" : "Associates set, count = " + value.Count.ToString());
            RemoveListeners(mAssociates);
            mAssociates = AddListeners(value);
        }
    }

    private ObservableCollection<string> mAssociates = AddListeners(new ObservableCollection<string>() { "Jon", "Natalie" });

    public override String ToString()
    {
        StringBuilder buffer = new StringBuilder();
        buffer.AppendLine(String.Format("Name: {0}", Name));
        buffer.AppendLine("Associates:");
        foreach (String associate in mAssociates)
        {
            buffer.AppendLine(String.Format("\t{0}", associate));
        }
        return buffer.ToString();
    }

    static ObservableCollection<string> AddListeners(ObservableCollection<string> list)
    {
        if (list != null)
        {
            list.CollectionChanged -= list_CollectionChanged; // In case it was already there.
            list.CollectionChanged += list_CollectionChanged;
        }
        return list;
    }

    static ObservableCollection<string> RemoveListeners(ObservableCollection<string> list)
    {
        if (list != null)
        {
            list.CollectionChanged -= list_CollectionChanged; // In case it was already there.
        }
        return list;
    }

    // ValueWrapper<T> is a helper type that is not shown here.
    public static ValueWrapper<bool> ShowDebugInformation = new ValueWrapper<bool>(false);

    static void list_CollectionChanged(object sender, NotifyCollectionChangedEventArgs e)
    {
        if (!ShowDebugInformation)
            return;
        switch (e.Action)
        {
            case NotifyCollectionChangedAction.Add:
                Debug.WriteLine(string.Format("Added {0} items", e.NewItems.Count));
                break;
            case NotifyCollectionChangedAction.Move:
                Debug.WriteLine("Moved items");
                break;
            case NotifyCollectionChangedAction.Remove:
                Debug.WriteLine(string.Format("Removed {0} items", e.OldItems.Count));
                break;
            case NotifyCollectionChangedAction.Replace:
                Debug.WriteLine("Replaced items");
                break;
            case NotifyCollectionChangedAction.Reset:
                Debug.WriteLine("Reset collection");
                break;
        }
    }
}

// XmlSerializationHelper, DataContractSerializerHelper, BinaryFormatterHelper
// and SetValue<T> are helper types that are not shown here.
public static class TestTestConfiguration
{
    public static void Test()
    {
        var test = new TestConfiguration();

        Debug.WriteLine("\nTesting Xmlserializer...");
        var xml = XmlSerializationHelper.GetXml(test);
        using (new SetValue<bool>(TestConfiguration.ShowDebugInformation, true))
        {
            var testFromXml = XmlSerializationHelper.LoadFromXML<TestConfiguration>(xml);
            Debug.WriteLine("XmlSerializer result: " + testFromXml.ToString());
        }

        Debug.WriteLine("\nTesting Json.NET...");
        var json = JsonConvert.SerializeObject(test, Formatting.Indented);
        using (new SetValue<bool>(TestConfiguration.ShowDebugInformation, true))
        {
            var testFromJson = JsonConvert.DeserializeObject<TestConfiguration>(json);
            Debug.WriteLine("Json.NET result: " + testFromJson.ToString());
        }

        Debug.WriteLine("\nTesting DataContractSerializer...");
        var contractXml = DataContractSerializerHelper.GetXml(test);
        using (new SetValue<bool>(TestConfiguration.ShowDebugInformation, true))
        {
            var testFromContractXml = DataContractSerializerHelper.LoadFromXML<TestConfiguration>(contractXml);
            Debug.WriteLine("DataContractSerializer result: " + testFromContractXml.ToString());
        }

        Debug.WriteLine("\nTesting BinaryFormatter...");
        var binary = BinaryFormatterHelper.ToBase64String(test);
        using (new SetValue<bool>(TestConfiguration.ShowDebugInformation, true))
        {
            var testFromBinary = BinaryFormatterHelper.FromBase64String<TestConfiguration>(binary);
            Debug.WriteLine("BinaryFormatter result: " + testFromBinary.ToString());
        }

        Debug.WriteLine("\nTesting JavaScriptSerializer...");
        var javaScript = new JavaScriptSerializer().Serialize(test);
        using (new SetValue<bool>(TestConfiguration.ShowDebugInformation, true))
        {
            var testFromJavaScript = new JavaScriptSerializer().Deserialize<TestConfiguration>(javaScript);
            Debug.WriteLine("JavaScriptSerializer result: " + testFromJavaScript.ToString());
        }
    }
}

I ran the test above, and found:

1. XmlSerializer and Json.NET use the pre-existing collection if present. (In Json.NET this can be controlled by setting JsonSerializerSettings.ObjectCreationHandling to Replace.)
2. JavaScriptSerializer, BinaryFormatter and DataContractSerializer do not, and always allocate the collection themselves. For the latter two this is not surprising, as neither calls default constructors; both instead simply allocate empty memory directly.

I don't know why the serializers in case 1 behave this way. Perhaps their authors were concerned that the containing class might want to internally use a subclass of the collection being deserialized, or attach observers to observable collections as I have done, and so decided to honor that design?
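As a rough illustration of the Json.NET setting mentioned above, here is a minimal sketch. The JsonConfiguration and ObjectCreationHandlingDemo names are made up for this example; only JsonConvert, JsonSerializerSettings and the ObjectCreationHandling enum come from Json.NET (the Newtonsoft.Json package, which is assumed to be referenced).

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

// Hypothetical configuration type with a code default, for illustration only.
public class JsonConfiguration
{
    public List<string> Associates { get; set; } = new List<string> { "Jon", "Natalie" };
}

public static class ObjectCreationHandlingDemo
{
    // Serializes a default object, deserializes it with the given settings,
    // and reports how many items the deserialized list contains.
    public static int CountAfterRoundTrip(JsonSerializerSettings settings)
    {
        string json = JsonConvert.SerializeObject(new JsonConfiguration());
        JsonConfiguration copy = JsonConvert.DeserializeObject<JsonConfiguration>(json, settings);
        return copy.Associates.Count;
    }

    public static void Main()
    {
        // Default settings: the list created by the constructor is reused
        // and the deserialized items are appended to it.
        Console.WriteLine(CountAfterRoundTrip(new JsonSerializerSettings())); // 4

        // Replace: a new list is allocated and assigned through the setter.
        var replace = new JsonSerializerSettings
        {
            ObjectCreationHandling = ObjectCreationHandling.Replace
        };
        Console.WriteLine(CountAfterRoundTrip(replace)); // 2
    }
}
```

The setting applies to the whole deserialization call, so it affects every collection and nested object in the graph, not just the one property.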

One note - for all serializers (except, maybe, BinaryFormatter, about which I am unsure), if a collection property is declared specifically as an array then the serializer will allocate the array itself and set the array after it is fully populated. This means that arrays can always be used as proxy collections during serialization.
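To see the difference between a list property and an array property with XmlSerializer, a small side-by-side sketch may help. The ListVersusArray type and its member names are hypothetical and exist only for this illustration; the counts in the comments follow from the behaviour described above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type: one List property and one array property,
// both initialized with code defaults.
public class ListVersusArray
{
    public List<string> Names { get; set; } = new List<string> { "Jon", "Natalie" };
    public string[] Paths { get; set; } = new[] { "a", "b" };
}

public static class ListVersusArrayDemo
{
    // Serializes the object to an in-memory XML string and reads it back.
    public static ListVersusArray RoundTrip(ListVersusArray source)
    {
        var serializer = new XmlSerializer(typeof(ListVersusArray));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, source);
            using (var reader = new StringReader(writer.ToString()))
            {
                return (ListVersusArray)serializer.Deserialize(reader);
            }
        }
    }

    public static void Main()
    {
        ListVersusArray copy = RoundTrip(new ListVersusArray());

        // The pre-existing list is reused, so the defaults are doubled.
        Console.WriteLine("List count:   " + copy.Names.Count);   // 4

        // The array is built separately and then assigned, so it is replaced.
        Console.WriteLine("Array length: " + copy.Paths.Length);  // 2
    }
}
```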

By using a proxy array, you can guarantee that your collection is overwritten during deserialization:

    [IgnoreDataMember]
    [XmlIgnore]
    [ScriptIgnore]
    public ObservableCollection<string> Associates { get; set; } // Or List<string> or etc.

    [XmlArray("Associates")]
    [DataMember(Name="Associates")]
    public string[] AssociateArray
    {
        get
        {
            return (Associates == null ? null : Associates.ToArray());
        }
        set
        {
            if (Associates == null)
                Associates = new ObservableCollection<string>();
            Associates.Clear();
            if (value != null)
                foreach (var item in value)
                    Associates.Add(item);
        }
    }

Now the collection comes back with only the previously serialized members with all 5 serializers.
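For completeness, here is a minimal end-to-end sketch of the proxy-array approach using XmlSerializer only. It uses a List<string> instead of an ObservableCollection<string> to stay short, and the ProxiedConfiguration and ProxyArrayDemo names are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical configuration type using a proxy array for serialization.
public class ProxiedConfiguration
{
    private List<string> mAssociates = new List<string> { "Jon", "Natalie" };

    // The real collection is hidden from XmlSerializer.
    [XmlIgnore]
    public List<string> Associates
    {
        get { return mAssociates; }
        set { mAssociates = value; }
    }

    // The array stands in for the collection under the same element name.
    [XmlArray("Associates")]
    public string[] AssociateArray
    {
        get { return Associates == null ? null : Associates.ToArray(); }
        set
        {
            if (Associates == null)
                Associates = new List<string>();
            Associates.Clear(); // Discard the code defaults.
            if (value != null)
                Associates.AddRange(value);
        }
    }
}

public static class ProxyArrayDemo
{
    // Serializes the object to an in-memory XML string and reads it back.
    public static ProxiedConfiguration RoundTrip(ProxiedConfiguration source)
    {
        var serializer = new XmlSerializer(typeof(ProxiedConfiguration));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, source);
            using (var reader = new StringReader(writer.ToString()))
            {
                return (ProxiedConfiguration)serializer.Deserialize(reader);
            }
        }
    }

    public static void Main()
    {
        var source = new ProxiedConfiguration();
        source.Associates.Remove("Jon");
        source.Associates.Add("Pete");

        ProxiedConfiguration copy = RoundTrip(source);
        Console.WriteLine(string.Join(", ", copy.Associates)); // Natalie, Pete
    }
}
```

Note that the setter only runs when the XML actually contains an Associates element; if the element is missing, the code defaults remain in place, which is usually what you want for a configuration file.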
