benito_h

Reputation: 550

How to parse a string of multiple JSON objects without separators in Python?

Given a single-line string of multiple, arbitrarily nested JSON objects without separators, for example:

contents = r'{"payload":{"device":{"serial":213}}}{"payload":{"device":{"serial":123}}}'

How can contents be parsed into a list of dicts? I tried:

df = pd.read_json(contents, lines=True)

But I only got a ValueError:

ValueError: Unexpected character found when decoding array value (2)

Upvotes: 2

Views: 420

Answers (2)

Dave Butler

Reputation: 1823

This has been answered here: https://stackoverflow.com/a/54666028/693869

Here is an example of a generator that could work. I added some comment strings that would cause the accepted answer to break.

import json
from typing import Iterator

contents = r'{"payload":{"device":{"serial":213}},"comment":"spoiler:|hello|"}{"payload":{"device":{"serial":123}},"comment":"Hey look at my strange face: }{"}'


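def parse_payloads(s: str) -> Iterator[dict]:
    # raw_decode parses one JSON value starting at index `end` and returns
    # the value plus the index just past it, so no separator is needed.
    decoder = json.JSONDecoder()
    end = 0
    while end < len(s):
        item, end = decoder.raw_decode(s, end)
        print(item)
        yield item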


json_dicts = list(parse_payloads(contents))

print(json_dicts)
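
One thing to watch for: raw_decode does not skip leading whitespace, so the loop above would raise if the objects were separated by spaces or if the string ended with a newline. A minimal sketch of a more tolerant variant follows, assuming only JSON whitespace (space, tab, CR, LF) can appear between objects. The names iter_json_objects and sample are made up for illustration.

```python
import json
from typing import Any, Iterator

JSON_WHITESPACE = " \t\n\r"


def iter_json_objects(text: str) -> Iterator[Any]:
    """Yield each JSON value found back to back in `text` (hypothetical helper)."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode() expects a value to start exactly at `pos`,
        # so skip any whitespace between values by hand.
        while pos < len(text) and text[pos] in JSON_WHITESPACE:
            pos += 1
        if pos >= len(text):
            return
        value, pos = decoder.raw_decode(text, pos)
        yield value


# Assumed sample input: two objects with whitespace between and after them.
sample = '{"payload":{"device":{"serial":213}}} \n {"payload":{"device":{"serial":123}}}\n'
serials = [obj["payload"]["device"]["serial"] for obj in iter_json_objects(sample)]
print(serials)  # [213, 123]
```

Malformed input still raises json.JSONDecodeError from raw_decode, which is probably what you want rather than silently dropping data.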

Upvotes: 0

RJ Adriaansen

Reputation: 9619

You can split the string, then parse each JSON string into a dictionary:

import json

contents = r'{"payload":{"device":{"serial":213}}}{"payload":{"device":{"serial":123}}}'

json_strings = contents.replace('}{', '}|{').split('|')
json_dicts = [json.loads(string) for string in json_strings]

Output:

[{'payload': {'device': {'serial': 213}}}, {'payload': {'device': {'serial': 123}}}]
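
A caveat worth checking before relying on this: the split works only as long as no string value contains '}{' or '|'. The sketch below runs the same replace-and-split on the string with comment fields from the other answer. The names tricky and split_and_load are just for illustration.

```python
import json

# Input taken from the other answer: the comment values contain '|' and '}{'.
tricky = r'{"payload":{"device":{"serial":213}},"comment":"spoiler:|hello|"}{"payload":{"device":{"serial":123}},"comment":"Hey look at my strange face: }{"}'


def split_and_load(s: str) -> list:
    """Same replace-and-split idea as above, wrapped in a hypothetical helper."""
    pieces = s.replace('}{', '}|{').split('|')
    return [json.loads(piece) for piece in pieces]


try:
    split_and_load(tricky)
    split_failed = False
except json.JSONDecodeError as exc:
    # The '|' inside the first comment cuts that object in half,
    # so the first piece is not valid JSON on its own.
    split_failed = True
    print("split approach failed:", exc)
```

If the data can contain arbitrary strings, a decoder-based approach such as json.JSONDecoder().raw_decode is the safer choice.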

Upvotes: -1
