Philippe Cerfon

Reputation: 19

How to parse JSON from a file and record the lines/columns each parsed object had in the file?

Assuming I parse a JSON file like:

{ "foo":  [
          "bar",
          "baz"
          ]
}

in Python via something like:

with open("example.json") as f:
    j = json.load(f)

I'd like a way to get the line number and position of e.g. j["foo"][0] (which is "bar") in the text file.

The line number is probably obvious (line 2 in the example). The position is at least the column where the object starts in the JSON file, but ideally also its last column (columns 11-15 in the example).
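To make that convention concrete, here is a minimal sketch that maps a character span to a 1-based line and an inclusive column range. The helper name `span_to_line_cols` and the multi-line layout of the sample text are assumptions made for illustration; the span is located with a plain string search, not by a JSON parser.

```python
def span_to_line_cols(text, start, end):
    """Map a half-open character span [start, end) to (line, first_col, last_col).

    All three results are 1-based and last_col is inclusive. The span is
    assumed not to cross a line break.
    """
    line = text.count("\n", 0, start) + 1
    # rfind returns -1 when there is no earlier newline, so adding 1 yields 0
    # for the first line and the index after the newline otherwise.
    line_start = text.rfind("\n", 0, start) + 1
    first_col = start - line_start + 1
    last_col = end - line_start
    return line, first_col, last_col


# Assumed layout of the sample document: "bar" sits on its own line,
# indented by ten spaces.
text = '{ "foo":  [\n          "bar",\n          "baz"\n          ]\n}'
start = text.index('"bar"')
print(span_to_line_cols(text, start, start + len('"bar"')))  # (2, 11, 15)
```

The open question is then only how to obtain `start` and `end` for each parsed value.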

Ideally this would work with the standard library json module rather than adding a third-party JSON parser as a dependency.

The motivation is to tell the user exactly where the contents of some JSON file are wrong. By that I don't mean the syntax of JSON itself, but cases where my format requires, for example, that j["foo"][0] be a string and it isn't.

I looked at several other JSON decoders for Python, but as far as I could see, none of them supports this.

Upvotes: 1

Views: 79

Answers (2)

blhsing

Reputation: 107015

One approach that maximizes code reuse takes advantage of the fact that JSON syntax is almost entirely valid Python expression syntax (true, false and null parse as plain names, which are mapped explicitly below). Parse the JSON text with ast.parse, then recursively transform the AST nodes into nested lists and dicts until you reach Constant nodes. For each of those, return a proxy object that behaves like the value but also gives access to the AST node's location attributes:

import ast

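# JSON literals that Python parses as plain names rather than constants.
NAME_MAPPING = {'true': True, 'false': False, 'null': None}

def json_ast(json):
    def _json_ast(node):
        if isinstance(node, ast.List):
            return list(map(_json_ast, node.elts))
        if isinstance(node, ast.Dict):
            return dict(zip(map(_json_ast, node.keys), map(_json_ast, node.values)))
        if isinstance(node, ast.Name):
            return NAME_MAPPING[node.id]
        # Anything else is expected to be an ast.Constant (string or number).
        # Subclass the value's own type so the result still behaves like that
        # value, and forward unknown attributes (lineno, col_offset, ...) to
        # the AST node.
        class _Constant(type(node.value)):
            def __getattr__(self, name):
                return getattr(node, name)
        return _Constant(node.value)
    # The JSON text parses as a single expression statement; convert its value.
    return _json_ast(ast.parse(json).body[0].value)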

so that:

json = '''\
{ "foo": [
          "bar",
          "baz"
         ]
}'''
data = json_ast(json)
value = data["foo"][0]
print(data)
print(value)
print(value.lineno)
print(value.end_lineno)
print(value.col_offset)
print(value.end_col_offset)

outputs:

{'foo': ['bar', 'baz']}
bar
2
2
10
15

Demo: https://ideone.com/d5cAQo
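For the validation use case in the question, the same AST idea can be turned into an error message. The following is only a sketch under stated assumptions: `load_with_nodes` and `location` are hypothetical helper names, and it keeps a path-to-node mapping instead of proxy objects, which lets true, false and null carry a location too. It also handles negative numbers, which Python parses as a unary minus applied to a constant. Note that col_offset is 0-based and counts UTF-8 bytes, and end_col_offset is exclusive, so the reported columns are only character columns for ASCII lines.

```python
import ast

NAMES = {"true": True, "false": False, "null": None}


def load_with_nodes(text):
    """Parse JSON-compatible text and return (data, nodes).

    nodes maps a path tuple such as ("foo", 0) to the AST node of that value.
    """
    nodes = {}

    def convert(node, path):
        nodes[path] = node
        if isinstance(node, ast.List):
            return [convert(n, path + (i,)) for i, n in enumerate(node.elts)]
        if isinstance(node, ast.Dict):
            result = {}
            for key_node, value_node in zip(node.keys, node.values):
                key = key_node.value  # JSON object keys are string constants
                result[key] = convert(value_node, path + (key,))
            return result
        if isinstance(node, ast.Name):
            return NAMES[node.id]
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -node.operand.value  # negative number such as -1.5
        return node.value  # ast.Constant: string or non-negative number

    # mode="eval" parses a single expression; its root node is tree.body.
    tree = ast.parse(text, mode="eval")
    return convert(tree.body, ()), nodes


def location(node):
    """Format a node's position as a 1-based line and inclusive columns."""
    return f"line {node.lineno}, columns {node.col_offset + 1}-{node.end_col_offset}"


text = '{ "foo": [\n          "bar",\n          42\n         ]\n}'
data, nodes = load_with_nodes(text)
value = data["foo"][1]
if not isinstance(value, str):
    where = location(nodes[("foo", 1)])
    print(f"{where}: expected a string, got {type(value).__name__}")
```

Because the parser here is Python's, input that is valid Python but not valid JSON (single-quoted strings, trailing commas) is accepted as well; running json.loads on the text first would reject such input.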

Upvotes: 5

Yoshandan

Reputation: 1

I hope this helps:

with open("stack_js.json", "r") as f:
    data = f.readlines()

# Number lines from 1, as editors do, and drop each line's trailing newline.
for i, content in enumerate(data, start=1):
    print("line no: ", i, content.rstrip("\n"))

Output:

...
line no:  2           "bar",
line no:  3           "baz"
...
...
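Printing numbered lines does not say which value on a line is meant. As a minimal sketch, assuming the line number and the 1-based inclusive columns are already known from some position-aware parser, a hypothetical helper `underline` could mark the span beneath the source line:

```python
def underline(lines, lineno, first_col, last_col):
    """Return the source line followed by a caret marker under the given span.

    lineno and both columns are 1-based, and last_col is inclusive.
    Alignment assumes the line contains no tab characters.
    """
    source = lines[lineno - 1].rstrip("\n")
    prefix = f"{lineno}: "
    marker = " " * (first_col - 1) + "^" * (last_col - first_col + 1)
    return prefix + source + "\n" + " " * len(prefix) + marker


# Sample lines as readlines() would return them for the example file.
lines = ['{ "foo": [\n', '          "bar",\n', '          "baz"\n', '         ]\n', '}\n']
print(underline(lines, 2, 11, 15))
```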

Upvotes: -4
