nozd

Reputation: 21

How do I prevent Newtonsoft.Json Deserialize() from creating a lot of objects?

I have a file containing JSON. It holds a lot of objects, many of which are identical. After deserialization I end up with many identical objects that are not equal by reference. That would not be a problem in itself, but I have a memory limit, so I need identical objects to be the same instance (equal by reference).

using (StreamReader file = File.OpenText("filePath"))
using (JsonTextReader reader = new JsonTextReader(file))
{
    reader.SupportMultipleContent = true;
    var serializer = new JsonSerializer();
    var result = serializer.Deserialize<T>(reader); // After that operation my memory is full

    return result;
}

Can I configure Newtonsoft.Json's Deserialize() to do this?

UPDATE: Example (not real data)

public class Data
{
    public int Version;
    
    public List<Office> Offices;
}

public class Office
{ 
    public int Id;
    public List<Mark> Marks;
}

public class Mark 
{
    public string Name;
}

{
    "version": 1,
    "offices": [
        {
            "id": 1,
            "marks": [
                {
                    "name": "white"
                },
                {
                    "name": "blue"
                }
            ]
        },
        {
            "id": 2,
            "marks": [
                {
                    "name": "white"
                },
                {
                    "name": "green"
                }
            ]
        },
        {
            "id": 3,
            "marks": [
                {
                    "name": "white"
                },
                {
                    "name": "blue"
                }
            ]
        }
    ]
}

In other words, I can't process the "offices" objects one by one, because they are nested inside a "wrapper" object.

Upvotes: 1

Views: 658

Answers (2)

Alberto Chiesa

Reputation: 7360

I don't see any Json example in the question, so the answer can only be generic...

You could use a custom converter (a JsonConverter subclass) for the class. Something like:

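using System;
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public class MyConverter : JsonConverter
{
    // Maps an object's id to the instance already deserialized for it.
    private readonly Dictionary<string, MyObject> _cache = new Dictionary<string, MyObject>();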

    /// <summary>Writes the JSON representation of the object.</summary>
    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        serializer.Serialize(writer, value);
    }

    /// <summary>Reads the JSON representation of the object.</summary>
    /// <param name="reader">The <see cref="T:Newtonsoft.Json.JsonReader" /> to read from.</param>
    /// <param name="objectType">Type of the object.</param>
    /// <param name="existingValue">The existing value of object being read.</param>
    /// <param name="serializer">The calling serializer.</param>
    /// <returns>The object value.</returns>
    public override object ReadJson(JsonReader reader,
        Type objectType,
        object existingValue,
        JsonSerializer serializer)
    {
        var jObject = JObject.Load(reader);

        // you can check for specific properties in the object that are used to check if it's the same object
        if (!jObject.ContainsKey("MyId")) return null;

        var objectId = jObject["MyId"].Value<string>();

        // the object was already deserialized. just return the cached instance.
        // Please note this is NOT performing a check over all the object properties!
        if (_cache.TryGetValue(objectId, out var existingInstance))
            return existingInstance;

        // in case of object not found, deserialize and cache.
        return _cache[objectId] = jObject.ToObject<MyObject>();
    }

    /// <summary>
    /// Determines whether this instance can convert the specified object type.
    /// </summary>
    /// <param name="objectType">Type of the object.</param>
    /// <returns>
    ///     <c>true</c> if this instance can convert the specified object type; otherwise, <c>false</c>.
    /// </returns>
    public override bool CanConvert(Type objectType)
        => objectType == typeof(MyObject);
}

In the source class you decorate the (list) property like so:

[JsonProperty(ItemConverterType = typeof(MyConverter))]
public List<MyObject> Objects { get; set; }

It's just a proof of concept, to give you an idea.
There are many possible pitfalls, such as how to handle subsequent deserializations, and whether (and how) you can decide that two objects are the same really depends on your specific requirements. Still, if you really need to reduce memory consumption, I think the custom deserialization route could be the way to go, depending on the complexity of your data structures.
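
To make the idea concrete, here is a minimal sketch applied to the sample classes from the question. It assumes that two marks count as "the same" whenever their names match, which is my assumption and not something the question states. `MarkConverter` and `SharingDemo` are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public class Mark
{
    public string Name;
}

public class Office
{
    public int Id;

    // Every item of this list is read through MarkConverter.
    [JsonProperty(ItemConverterType = typeof(MarkConverter))]
    public List<Mark> Marks;
}

public class Data
{
    public int Version;
    public List<Office> Offices;
}

// Hypothetical converter: returns one shared Mark instance per distinct name.
public class MarkConverter : JsonConverter
{
    private readonly Dictionary<string, Mark> _cache = new Dictionary<string, Mark>();

    public override bool CanConvert(Type objectType) => objectType == typeof(Mark);

    public override object ReadJson(JsonReader reader, Type objectType,
        object existingValue, JsonSerializer serializer)
    {
        if (reader.TokenType == JsonToken.Null) return null;

        var jObject = JObject.Load(reader);

        // The sample JSON uses lowercase keys, so look the property up case-insensitively.
        var name = (string)jObject.GetValue("name", StringComparison.OrdinalIgnoreCase);
        if (name == null) return null;

        // Reuse the instance created earlier for this name, if there is one.
        if (_cache.TryGetValue(name, out var cached)) return cached;

        var mark = new Mark { Name = name };
        _cache[name] = mark;
        return mark;
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
        => serializer.Serialize(writer, value);
}

public static class SharingDemo
{
    // True when the first mark of the first two offices is the same instance.
    public static bool FirstMarksShared(string json)
    {
        var data = JsonConvert.DeserializeObject<Data>(json);
        return ReferenceEquals(data.Offices[0].Marks[0], data.Offices[1].Marks[0]);
    }
}
```

With the JSON from the question, `SharingDemo.FirstMarksShared` should report that both "white" marks are one instance. As far as I know the converter instance is kept alive together with the cached contract, so the dictionary may outlive a single deserialization call. That is the "subsequent deserializations" pitfall mentioned above.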

As they say, YMMV :D

Upvotes: 2

Marc Gravell

Reputation: 1063824

If the JSON format is allowed to change, you can use reference-tracking with Newtonsoft.Json, via either:

  1. new JsonSerializerSettings { PreserveReferencesHandling = PreserveReferencesHandling.Objects }
  2. [JsonObject(IsReference = true)] on specific types

The deserializer will deal with either automatically.

Emphasis: this changes the JSON fundamentally.
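
To illustrate what "changes the JSON" means, here is a small sketch of option 1. The `Mark` class mirrors the question's sample, and `ReferenceDemo` is a hypothetical name for illustration. When the same instance appears twice, the writer emits it once with an `"$id"` property and emits a `"$ref"` pointer for the repeat. The reader, given the same settings, resolves the pointer back to a single instance.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;

public class Mark
{
    public string Name;
}

public static class ReferenceDemo
{
    // The same settings must be used for writing and for reading.
    private static readonly JsonSerializerSettings Settings = new JsonSerializerSettings
    {
        PreserveReferencesHandling = PreserveReferencesHandling.Objects
    };

    // Serializes a list that contains one Mark instance twice.
    public static string Write()
    {
        var white = new Mark { Name = "white" };
        var marks = new List<Mark> { white, white };
        return JsonConvert.SerializeObject(marks, Settings);
    }

    // Deserializes JSON that carries "$id" / "$ref" metadata.
    public static List<Mark> Read(string json)
        => JsonConvert.DeserializeObject<List<Mark>>(json, Settings);
}
```

The catch is that the file must already contain that metadata. JSON written without reference tracking, like the sample in the question, has no `"$ref"` entries for the deserializer to resolve.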


If the JSON cannot be changed, then: from the perspective of the deserializer, they are independent objects, so yes: you'll get lots of them. The fact that they have the same contents is irrelevant, and the deserializer isn't going to constantly check against previous objects to see whether they have the same values. This is for many reasons, including avoiding problems if you do something like:

var obj = Deserialize(path);
obj.Items[3].Name = "Fred"; // change one record
Serialize(obj, path);

If the serializer had unilaterally decided to make all the like objects use the same instance, this one-record change could change other arbitrary data that you didn't expect.
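
A tiny sketch of that hazard, with no serializer involved. `Item` and `AliasingDemo` are hypothetical names, and the list is built by hand to imitate what a deduplicating deserializer would have produced.

```csharp
using System.Collections.Generic;

public class Item
{
    public string Name;
}

public static class AliasingDemo
{
    public static List<Item> Run()
    {
        var shared = new Item { Name = "white" };

        // Imitates a deduplicated result: entries 0 and 1 are one object.
        var items = new List<Item> { shared, shared, new Item { Name = "blue" } };

        // Intended as a change to a single record...
        items[0].Name = "Fred";

        // ...but items[1] is the same object, so its Name is now "Fred" too.
        return items;
    }
}
```

If you do share instances, treating the shared objects as read-only after deserialization avoids this kind of surprise.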

Upvotes: 2
