homeboy

Reputation: 171

Parse incorrect JSON

I use JSON to send data over a WebSocket. Sometimes the WebSocket receives several messages as one, and event.data looks like:

{"message1":"message1"}{"message2":"message2"}

so I can't parse it with JSON.parse. How can I handle this?

Upvotes: 2

Views: 1845

Answers (3)

georg

Reputation: 214959

Here's an example of an auto-recovering JSON parser, which you can use to parse concatenated JSON values:

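function *multiJson(str) {
    while (str) {
        try {
            yield JSON.parse(str);
            str = '';
        } catch (e) {
            // V8 (Chrome, Node.js) reports the offset of the syntax error as
            // "position N"; other engines may word the message differently.
            var m = String(e).match(/position\s+(\d+)/);
            var pos = m ? Number(m[1]) : 0;
            // Without a usable offset we cannot recover, so rethrow
            // instead of looping forever.
            if (pos <= 0) throw e;
            yield JSON.parse(str.slice(0, pos));
            str = str.slice(pos);
        }
    }
}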

//

let test = '{"message1":"message1"}{"message2":{"nested":"hi}{there"}}"third"[4,5,6]';

for (let x of multiJson(test))
    console.log(x)

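Basically, if there's a syntax error at position n, it parses everything before n as one value and then repeats the process on what's after it. Note that this depends on the engine including the position in the error message.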

Upvotes: 5

lhl

Reputation: 26

If you have any control over the API, I would strongly recommend fixing it there. If you don't, please keep reading.

I assume that looking for "}" is not really an option since you could have nested objects and the } character might be inside a string and so on.

A quick and easy way would be to try parsing the string starting with one character and adding characters one by one until the JSON parser succeeds. At that point you have your first chunk of data parsed.

Move the offset to the end of the successfully parsed data and repeat.

It might not be an elegant or very efficient solution, but then again you have a non-standard data format.
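
The growing-prefix idea above could be sketched roughly like this. The name splitByGrowingPrefix is made up for illustration, and the sketch assumes every top-level value is an object, array, or string; a bare number such as 12 would be cut short at 1, because 1 is already valid JSON.

```javascript
// Hypothetical helper (not from the answer): parse concatenated JSON values
// by growing a candidate prefix one character at a time.
// Assumption: top-level values are objects, arrays, or strings, so no proper
// prefix of a value is itself valid JSON.
function splitByGrowingPrefix(input) {
    const values = [];
    let start = 0;
    while (start < input.length) {
        // Only whitespace left: nothing more to parse.
        if (input.slice(start).trim() === '') break;
        let parsed = false;
        for (let end = start + 1; end <= input.length; end++) {
            try {
                values.push(JSON.parse(input.slice(start, end)));
                start = end; // move the offset past the parsed value
                parsed = true;
                break;
            } catch (e) {
                // Not a complete value yet; extend the candidate by one character.
            }
        }
        if (!parsed) {
            throw new SyntaxError('No valid JSON value found at offset ' + start);
        }
    }
    return values;
}

console.log(splitByGrowingPrefix('{"message1":"message1"}{"message2":"message2"}'));
```

Every failed attempt re-parses the whole candidate, so the cost grows roughly quadratically with the length of each message. That is probably fine for short WebSocket frames but not for large payloads.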

Upvotes: 1

rsp

Reputation: 111366

If you cannot fix it on the sending side and the input always looks like this, you could replace every '}{' with '}\n{' and split on newlines to get an array of JSON strings.

var array = input.replace(/\}\{/g, '}\n{').split('\n');

Note that if your input contains newlines then you have to use another character or string:

var array = input.replace(/\}\{/g, '}==XXX=={').split('==XXX==');

but it relies on the fact that '}{' does not appear anywhere else in the input (for example inside a string value), which may not be true.

A more correct way, but harder, would be to count { and } that are not inside of strings, and when you get the same number of } as the number of { then split the string there.

What you would have to do is go character by character and keep track of whether you are inside quotes or not, make every { increment a counter, } decrement a counter and split your input whenever your counter hits zero.
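
A minimal sketch of that counter, assuming every top-level value is an object (anything between objects is ignored). The function name splitByBraceDepth is hypothetical, chosen only for this example.

```javascript
// Hypothetical helper (not from the answer): split concatenated JSON objects
// by tracking brace depth outside of string literals.
// Assumption: every top-level value is an object.
function splitByBraceDepth(input) {
    const chunks = [];
    let depth = 0;        // number of currently open '{'
    let inString = false; // true while between unescaped double quotes
    let escaped = false;  // true right after a backslash inside a string
    let start = 0;        // index where the current top-level object began
    for (let i = 0; i < input.length; i++) {
        const ch = input[i];
        if (inString) {
            if (escaped) escaped = false;
            else if (ch === '\\') escaped = true;
            else if (ch === '"') inString = false;
            continue; // braces inside strings do not count
        }
        if (ch === '"') {
            inString = true;
        } else if (ch === '{') {
            if (depth === 0) start = i;
            depth++;
        } else if (ch === '}' && depth > 0) {
            depth--;
            // Counter back at zero: one complete top-level object.
            if (depth === 0) chunks.push(input.slice(start, i + 1));
        }
    }
    return chunks;
}

const parts = splitByBraceDepth('{"message1":"message1"}{"message2":{"nested":"hi}{there"}}');
console.log(parts.map(function (s) { return JSON.parse(s); }));
```

This makes a single pass over the input and calls JSON.parse once per message, at the price of reimplementing a small part of the JSON string rules.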

Another hacky way would be to try to split the string on every possible } and try to parse the substring as JSON and if it's valid then use it and remove from the input.
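
That last idea might look something like the following. It is only a sketch: splitAtClosingBraces is an illustrative name, and it again assumes that every top-level value is an object.

```javascript
// Hypothetical helper (not from the answer): try each '}' as a possible end
// of the first object and keep the first prefix that parses.
// Assumption: every top-level value is an object.
function splitAtClosingBraces(input) {
    const values = [];
    let rest = input;
    while (rest.trim() !== '') {
        let end = rest.indexOf('}');
        let found = false;
        while (end !== -1) {
            try {
                values.push(JSON.parse(rest.slice(0, end + 1)));
                rest = rest.slice(end + 1); // remove the parsed part from the input
                found = true;
                break;
            } catch (e) {
                // This '}' was nested or inside a string; try the next one.
                end = rest.indexOf('}', end + 1);
            }
        }
        if (!found) {
            throw new SyntaxError('Remaining input is not a JSON object');
        }
    }
    return values;
}

console.log(splitAtClosingBraces('{"message1":"message1"}{"message2":{"nested":"hi}{there"}}'));
```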

Upvotes: 1
