Animesh D

Reputation: 5002

Recursive JSON parsing using JSON.NET

I have the following JSON object returned from a JSON differ:

{
    "lastName": ["Bab", "Beb"],
    "middleName": ["Cg", "seeg"],
    "contact":
    {
        "emailAddress": ["[email protected]", "[email protected]"],
        "addresses":
        [
            {
                "state": ["AL", "AZ"]
            },
            {
                "state": ["TN", "MO"]
            }
        ]
    }
}

I need a list of changes in the following fashion.

lastName/new:Bab/old:Beb
middleName/new:Cg/old:seeg
contact.emailAddress/new:[email protected]/old:[email protected]
contact.addresses[0].state/new:AL/old:AZ
contact.addresses[1].state/new:TN/old:MO

So I wrote this ugly program using a bit of recursion.

private static IEnumerable<DocumentProperty> ParseJObject(JObject node)
{
    HashSet<DocumentProperty> documentProperties = new HashSet<DocumentProperty>();
    DocumentProperty documentProperty = new DocumentProperty();

    foreach (KeyValuePair<string, JToken> sub in node)
    {
        if (sub.Value.Type == JTokenType.Array)
        {
            // unnamed nodes which contain nested objects 
            if (sub.Value.First.Type == JTokenType.Object)
            {
                foreach (var innerNode in sub.Value.Children())
                {
                    documentProperties.UnionWith(ParseJObject((JObject)innerNode));
                }
            }

            documentProperty = CreateDocumentProperty(sub.Value);
        }
        else if (sub.Value.Type == JTokenType.Object)
        {
            documentProperties.UnionWith(ParseJObject((JObject)sub.Value));
        }

        documentProperties.Add(documentProperty);
    }

    return documentProperties;
}

It works, except that it produces some extra output:

lastName/new:Bab/old:Beb
middleName/new:Cg/old:seeg
contact.emailAddress/new:[email protected]/old:[email protected]
contact.addresses[0].state/new:AL/old:AZ
contact.addresses[1].state/new:TN/old:MO
contact.addresses/new:{                <-----------------------------Extra here.
  "state": [
    "AL",
    "AZ"
  ]
}/old:{
  "state": [
    "TN",
    "MO"
  ]
}

I suspect this is due to how I have set up my recursion. Can you immediately make out what is wrong here?

Definition for CreateDocumentProperty

private static DocumentProperty CreateDocumentProperty(JToken subValue) => new DocumentProperty()
{
    PropertyName = subValue.Path,
    New = subValue[0].ToString(),
    Old = subValue[1].ToString()
};

Main method:

static void Main()
{
    JToken jToken = JToken.Parse("{\"lastName\":[\"Bab\",\"Beb\"],\"middleName\":[\"Cg\",\"seeg\"],\"contact\":{\"emailAddress\":[\"[email protected]\",\"[email protected]\"],\"addresses\":[{\"state\":[\"AL\",\"AZ\"]},{\"state\":[\"TN\",\"MO\"]}]}}");

    JObject inner = jToken.Value<JObject>();
    IEnumerable<DocumentProperty> data = ParseJObject(inner);

    foreach (var item in data) Console.WriteLine(item);
}

Upvotes: 1

Views: 1764

Answers (1)

dbc

Reputation: 117154

Rather than writing your own recursive code, you can use JContainer.DescendantsAndSelf() to find all new value/old value pairs, then transform them into strings with the required formatting using LINQ.

First, define the following extension method:

public static IEnumerable<string> GetDiffPaths(this JContainer root)
{
    if (root == null)
        throw new ArgumentNullException(nameof(root));
    var query = from array in root.DescendantsAndSelf().OfType<JArray>()
                where array.Count == 2 && array[0] is JValue && array[1] is JValue
                select $"{array.Path}/new:{array[0]}/old:{array[1]}";
    return query;
}

And then do:

var jContainer = jToken as JContainer;
if (jContainer == null)
    throw new JsonException("Input was not a container");

foreach (var item in jContainer.GetDiffPaths())
{
    Console.WriteLine(item);
}

Demo fiddle here.
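
To see why the where clause matters, here is a minimal sketch that lists every JArray found by DescendantsAndSelf() along with its Path and whether it passes the pair test. The class and method names (DescendantsDemo, ClassifyArrays) and the trimmed-down sample JSON are my own, chosen for illustration. The contact.addresses array has two elements, but they are objects rather than JValue instances, so it should be reported as skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

public static class DescendantsDemo
{
    // Hypothetical helper: labels each array under root as "pair" or "skipped".
    public static IEnumerable<string> ClassifyArrays(JContainer root)
    {
        foreach (var array in root.DescendantsAndSelf().OfType<JArray>())
        {
            // Same test as GetDiffPaths: exactly two primitive values.
            bool isPair = array.Count == 2 && array[0] is JValue && array[1] is JValue;
            string label = isPair ? "pair" : "skipped";
            yield return $"{array.Path}: {label}";
        }
    }

    public static void Main()
    {
        // Trimmed-down version of the diff from the question.
        var root = JObject.Parse(@"{
            ""lastName"": [""Bab"", ""Beb""],
            ""contact"": {
                ""addresses"": [
                    { ""state"": [""AL"", ""AZ""] },
                    { ""state"": [""TN"", ""MO""] }
                ]
            }
        }");

        foreach (var line in ClassifyArrays(root))
            Console.WriteLine(line);
    }
}
```

Because the filter looks at the element types instead of only the first child, an array of nested objects never gets formatted as a new/old pair.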

Notes:

  1. In the above code I am simply generating an enumerable of strings, but you could replace that with an enumerable of DocumentProperty objects (which was not fully included in your question).

  2. My assumption is that any JSON array containing exactly two primitive values represents a new value / old value pair.

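    Your code does not check for this correctly. CreateDocumentProperty(sub.Value) runs for every array, including an array of nested objects such as contact.addresses, which is where the extra output comes from. I believe that, at a minimum, an else is missing in the following location: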

    foreach (KeyValuePair<string, JToken> sub in node)
    {
        if (sub.Value.Type == JTokenType.Array)
        {
            // unnamed nodes which contain nested objects 
            if (sub.Value.First.Type == JTokenType.Object)
            {
                foreach (var innerNode in sub.Value.Children())
                {
                    documentProperties.UnionWith(ParseJObject((JObject)innerNode));
                }
            }
            else // ELSE WAS REQUIRED HERE
            {
                documentProperties.Add(CreateDocumentProperty(sub.Value));
            }
        }
        else if (sub.Value.Type == JTokenType.Object)
        {
            documentProperties.UnionWith(ParseJObject((JObject)sub.Value));
        }
    }
    

    Demo fiddle #2 here.

  3. JContainer represents either a JSON array or a JSON object. My assumption is that the diff routine must return one or the other.
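
As mentioned in note 1, the same query can produce DocumentProperty objects instead of strings. The following is only a sketch: the DocumentProperty class below is a hypothetical stand-in inferred from the property names used in CreateDocumentProperty (PropertyName, New, Old), since the real class is not shown in the question, and GetDiffProperties is a name I made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

// Hypothetical stand-in: the question does not show the real DocumentProperty class.
public class DocumentProperty
{
    public string PropertyName { get; set; }
    public string New { get; set; }
    public string Old { get; set; }

    public override string ToString() => $"{PropertyName}/new:{New}/old:{Old}";
}

public static class DiffExtensions
{
    // Same filter as GetDiffPaths, but projecting to DocumentProperty objects.
    public static IEnumerable<DocumentProperty> GetDiffProperties(this JContainer root)
    {
        if (root == null)
            throw new ArgumentNullException(nameof(root));
        return from array in root.DescendantsAndSelf().OfType<JArray>()
               where array.Count == 2 && array[0] is JValue && array[1] is JValue
               select new DocumentProperty
               {
                   PropertyName = array.Path,
                   New = array[0].ToString(),
                   Old = array[1].ToString()
               };
    }
}
```

It is called the same way as GetDiffPaths(), for example foreach (var item in jContainer.GetDiffProperties()) Console.WriteLine(item);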

Upvotes: 1
