Ivan Chen

Reputation: 13

Why can't JSON be read back if it was written with indentation in Python 3.8?

With indentation:

import json

a_dict = {"name": "kevin", "id": 100001}
with open('test.json', "a+") as f:
    json.dump(a_dict, f, indent=4)  # indent makes it more readable
    f.write("\n")
print("done")

The output is shown below. It can't be read back; the decoder raises json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes:

{
    "name": "kevin",
    "id": 100001
}
{
    "name": "kevin",
    "id": 100001
}

Without indentation:

a_dict = {"name": "kevin", "id": 100001}
with open('test.json', "a+") as f:
    json.dump(a_dict, f)  # no indent: each object stays on one line
    f.write("\n")
print("done")

Output that can be read:

{"name": "kevin", "id": 100001}
{"name": "kevin", "id": 100001}
{"name": "kevin", "id": 100001}
{"name": "kevin", "id": 100001}

Decoding:

import json

p_list = []
with open('test.json') as f:
    for json_obj in f:  # iterating a file yields one line at a time
        test_dict = json.loads(json_obj)
        p_list.append(test_dict)
        # print(test_dict)
for emp in p_list:
    print(emp['name'])

Upvotes: 1

Views: 251

Answers (1)

aneroid

Reputation: 15962

Looks like you're trying to use the JSONL / JSON Lines format. The premise of this format is that each object, or whichever JSON entity, is wholly represented on a single line of the file. So you can't have it be JSONL and indented/prettified at the same time.

When you do:

with open('test.json') as f:
    for json_obj in f:
        ...

each json_obj is just one line of the file, not the complete JSON object read through to its closing brace.
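To make that concrete, here is a minimal sketch of what the line-by-line loop sees when the object was written with indent=4. The helper name try_parse is made up for illustration; the rest is the standard json module.

```python
import json

# The same object as in the question, pretty-printed across several lines.
pretty = json.dumps({"name": "kevin", "id": 100001}, indent=4)
lines = pretty.splitlines(keepends=True)  # what `for json_obj in f` would yield

def try_parse(line):
    """Hypothetical helper: return the parsed value, or None if the line is not valid JSON."""
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        return None

first_line_result = try_parse(lines[0])  # the first line is just "{\n"
whole_result = json.loads(pretty)        # the full text parses fine
```

The first line on its own is an opening brace with nothing after it, so the decoder gives up there, while the complete text decodes without trouble.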

If you want to do it that way, you'll need to write your own JSON Decoder that reads in more lines until it's found the end-delimiter and a valid JSON entity. Same goes for writing the file - you'd need to write your own JSON Encoder.
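For the reading side, a shortcut that avoids a full custom decoder class is json.JSONDecoder.raw_decode() from the standard library, which parses one value from a string and reports where it ended. The sketch below is one possible approach, not the decoder described above: it assumes the whole file fits in memory, and the function name iter_json_objects is made up for illustration.

```python
import json

def iter_json_objects(text):
    """Yield each JSON value found back to back in `text` (hypothetical helper)."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so do it here
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # returns the parsed value and the index just past it
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Two indented objects, one after the other, as the question's writer produces.
pretty_stream = (
    json.dumps({"name": "kevin", "id": 100001}, indent=4) + "\n"
    + json.dumps({"name": "kevin", "id": 100001}, indent=4) + "\n"
)
objects = list(iter_json_objects(pretty_stream))
```

With a real file you would pass f.read() instead of pretty_stream. Malformed input still raises json.JSONDecodeError, so wrap the call in try/except if the file may be damaged.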

The closest thing to combining JSON Lines with prettified output is the jq command-line tool. Since it's not a Python package, you'd call it with subprocess.run() and capture_output=True to read and write data.
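As a rough sketch of the subprocess route: the function below assumes jq is installed and on the PATH, and uses its -c (compact output) option so that each value in the file comes back on one line. The name read_with_jq is made up for illustration.

```python
import json
import shutil
import subprocess

def read_with_jq(path):
    """Hypothetical helper: parse every JSON value in `path` by way of `jq -c .`."""
    result = subprocess.run(
        ["jq", "-c", ".", path],  # -c prints each value compactly on one line
        capture_output=True,
        text=True,
        check=True,               # raise CalledProcessError if jq reports an error
    )
    return [json.loads(line) for line in result.stdout.splitlines() if line.strip()]

jq_available = shutil.which("jq") is not None  # the helper only works when this is True
```

If jq is missing, subprocess.run() raises FileNotFoundError, so checking jq_available first gives a clearer failure.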

You can find questions related to this tool on Stack Overflow under its tag.


Edit: If you are certain that you will always write JSON objects to the file in exactly this way, you can set up the read to start at a line that is just { with no spaces/indentation before it, and keep reading until you reach a line that is just } with no spaces/indentation before it.

A rough idea:

import json

def read_pretty_objects(path):  # function name is only illustrative
    """Yield each top-level object from a file of indent=4 JSON objects."""
    with open(path) as f:
        parts = []
        in_obj = False
        for raw_line in f:
            some_text = raw_line.rstrip('\n')  # lines read from a file keep their newline
            if some_text == '{' and not in_obj:
                in_obj = True
                parts.append('{')
            elif in_obj:
                parts.append(some_text)
                if some_text == '}':
                    in_obj = False
                    # put this in a try-except block
                    json_obj = json.loads('\n'.join(parts))
                    yield json_obj
                    parts = []  # reset
                elif not some_text.startswith(' ' * 4):
                    print('error')  # in an object but wrong indent
                    # the check above should actually include checking more than
                    # just the starting 4 spaces since it could be nested further
            else:
                print('error')  # not in an object and not a start delimiter

Treat that as a starting point only: you'd still need to add proper error handling and stricter indentation checks before it is an actual parser.

Also, as noted by @ewen-lbh below, files in this format should have the .jsonl extension. If it's .json, you're implying that it holds a single valid, loadable JSON entity.

Upvotes: 2
