Anthony

Reputation: 1714

Remove attributes from a JSON string before deserializing

The web service I'm calling returns a JSON response with far more data than I actually need, and this makes deserialization take a very long time.

I'm using VB.NET and the Newtonsoft JSON library.

Using the following JSON as an example, how can I remove all values except the 'id' value?

{"results": [
  {"id":"1234", "name":"name value", "logo":"<some base64 encoded string>"},
  {"id":"1234", "name":"name value", "logo":"<some base64 encoded string>"},
  {"id":"1234", "name":"name value", "logo":"<some base64 encoded string>"},
  {"id":"1234", "name":"name value", "logo":"<some base64 encoded string>"},
  {"id":"1234", "name":"name value", "logo":"<some base64 encoded string>"}
]}

Would regular expressions be the best way?

I have just learnt that it also needs to handle a nested array of objects that also have an id property. The nested id property should be excluded from the final JSON.

{"results": [
  {"id":"1234", "name":"name value", "categories":[{"id":"1","name":"category"}]},
  {"id":"1234", "name":"name value", "categories":[{"id":"1","name":"category"}]},
  {"id":"1234", "name":"name value", "categories":[{"id":"1","name":"category"}]},
  {"id":"1234", "name":"name value", "categories":[{"id":"1","name":"category"}]},
  {"id":"1234", "name":"name value", "categories":[{"id":"1","name":"category"}]}
]}

Upvotes: 1

Views: 2627

Answers (2)

Alex Filipovici

Reputation: 32561

You may use the following expression

(?<="id":")[0-9]*(?=")

to get the IDs and then build your JSON string by using a for / foreach loop and a StringBuilder.

Here is a sample usage in C#; you may be able to adapt it for VB:

// requires: using System.Text; using System.Text.RegularExpressions;
var json = "{\"results\": [" +
"{\"id\":\"1234\", \"name\":\"name value\", \"logo\":\"<some base64 encoded string>\"}," +
"{\"id\":\"1234\", \"name\":\"name value\", \"logo\":\"<some base64 encoded string>\"}," +
"{\"id\":\"1234\", \"name\":\"name value\", \"logo\":\"<some base64 encoded string>\"}," +
"{\"id\":\"1234\", \"name\":\"name value\", \"logo\":\"<some base64 encoded string>\"}," +
"{\"id\":\"1234\", \"name\":\"name value\", \"logo\":\"<some base64 encoded string>\"}" +
"]}";

//try to get matches when JSON contains categories
MatchCollection matches = Regex.Matches(json, "(?<=\"id\":\")[0-9]*(?=\", \"name\":\".*\", \"categories\")");

//if no matches are present (i.e. categories are not included in the JSON)
if(matches.Count==0)
    matches = Regex.Matches(json, "(?<=\"id\":\")[0-9]*(?=\")");

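StringBuilder sBuilder = new StringBuilder();
sBuilder.Append("{\"results\": [");

for (int i = 0; i < matches.Count; i++)
{
    //separate entries with a comma, so there is no trailing comma after the last one
    if (i > 0)
        sBuilder.Append(",");
    sBuilder.Append("{\"id\":\"");
    sBuilder.Append(matches[i].Value);
    sBuilder.Append("\"}");
}

//also yields valid JSON (an empty array) when there are no matches
sBuilder.Append("]}");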

//use the JSON string
//sBuilder.ToString();

Upvotes: 2

Vicent

Reputation: 5452

As explained here, JSON is not a regular language, so regex is not the best way to achieve your goal (although it can probably be done). It is much better to use your parser for this kind of manipulation.

I think that with JSON.NET, for selecting items, you can do something like:

var ids = response["results"].Children()["id"];
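
To make that a bit more concrete, here is a minimal sketch of the parser-based approach, assuming the response has the shape shown in the question. The class name TopLevelIds and the method Extract are made up for illustration; only JObject.Parse, Children() and the token indexer come from JSON.NET. Because it only looks at the direct children of "results", the ids inside "categories" are never visited.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of JSON.NET.
public static class TopLevelIds
{
    // Returns the "id" of each object directly inside "results".
    // Nested ids (e.g. inside "categories") are not touched.
    public static List<string> Extract(string json)
    {
        JObject response = JObject.Parse(json);
        return response["results"]
            .Children()
            .Select(item => (string)item["id"])
            .Where(id => id != null)   // skip objects that have no "id"
            .ToList();
    }

    public static void Main()
    {
        // Sample data modelled on the question; the id values are invented.
        string json = "{\"results\": [" +
            "{\"id\":\"1234\", \"name\":\"name value\", \"categories\":[{\"id\":\"1\",\"name\":\"category\"}]}," +
            "{\"id\":\"5678\", \"name\":\"name value\", \"categories\":[{\"id\":\"2\",\"name\":\"category\"}]}" +
            "]}";

        Console.WriteLine(string.Join(",", Extract(json)));
    }
}
```

Note that JObject.Parse still reads the whole payload into memory, so this mainly solves the "which ids" problem rather than the speed problem. If parsing itself is the bottleneck, it may be worth measuring deserialization into a class that declares only an Id property, since JSON.NET by default ignores JSON members that have no matching property.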

Upvotes: 1
