Viv

Reputation: 1584

Error parsing a multiline JSON object in Python

I am trying to parse a file in Python that contains multiple multiline JSON objects separated by commas. However, whichever approach I use (json.load, reading it as a list, or treating it as a JSON Lines file), it fails to parse the data.

Input: the objects are all in a single file, laid out as follows:

{
    "0": "mdm-898040540420",
    "1": {
        "dchannel": "FR al"
    },
    "2": {
        "dchannel": "FR Website"
    },
    "3": {
        "dcountry": "BDF"
    }
},
{
    "0": "mdm-846290540037",
    "1": {
        "dchannel": "FR alk"
    },
    "2": {
        "dchannel": "FR Website"
    },
    "3": {
        "dcountry": "BDF"
    }
},......

And so on. It is like many small JSON objects in one file.

I tried enclosing the whole file in [] (like [{json1},{json2}...]) and using:

import json

with open("C:\\Users\\viv\\Downloads\\2020_11_21-10_31_03_PM_v2.json", 'r') as f:
    object_list = []
    # Each physical line is passed to json.loads on its own
    for line in f.readlines():
        object_list.append(json.loads(line))

I also tried it without the [], enclosing the whole file in {} and using the json library. Either way, it fails to parse.

Any method to parse it would be deeply appreciated. I want to produce a CSV as output, like this:

id               dchannel             dcountry
mdm-846290540037,"FR al, FR Website", BDF 

Error messages:

1. While trying:
df = pd.read_json("C:\\Users\\viv\\Downloads\\2020_11_21-10_31_03_PM_v2.json", lines=True)
df.head()

    self._parse_no_numpy()
  File "D:\workspace\BillingDashboard\venv\lib\site-packages\pandas\io\json\_json.py", line 1093, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None
ValueError: Expected object or value


2. While running:

entitiesList = []
print("Started Reading JSON file which contains multiple JSON document")
with open("C:\\Users\\viv\\Downloads\\edited_file_b.json",'r') as f:
    for jsonObj in f:
        entitiesDict = json.loads(jsonObj)
        entitiesList.append(entitiesDict)




  File "D:/workspace/BillingDashboard/bsdf_json_csv_converter.py", line 12, in <module>
    entitiesDict = json.loads(jsonObj)
  File "C:\python37\lib\json\__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "C:\python37\lib\json\decoder.py", line 337, in decode
Started Reading JSON file which contains multiple JSON document
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\python37\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 2)

Process finished with exit code 1

Upvotes: 0

Views: 323

Answers (2)

PIG208

Reputation: 2370

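The problem may lie in the format of the JSON data: several objects separated by commas, with nothing enclosing them, do not form a single valid JSON document.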

For instance, if the original JSON looks like this:

{"0": "mdm-898040540420",
    "1": {
        "dchannel": "FR al"
    },
    "2": {
        "dchannel": "FR Website"
    },
    "3": {
        "dcountry": "BDF"
    }
},
{
    "0": "mdm-846290540037",
    "1": {
        "dchannel": "FR alk"
    },
    "2": {
        "dchannel": "FR Website"
    },
    "3": {
        "dcountry": "BDF"
    }
}

You could try wrapping it in {"test": [...]} and parsing it with json.loads(text) (I feel that the choice of parser doesn't matter much in your case).

{"test":
    [{"0": "mdm-898040540420",
        "1": {
            "dchannel": "FR al"
        },
        "2": {
            "dchannel": "FR Website"
        },
        "3": {
            "dcountry": "BDF"
        }
    },
    {
        "0": "mdm-846290540037",
        "1": {
            "dchannel": "FR alk"
        },
        "2": {
            "dchannel": "FR Website"
        },
        "3": {
            "dcountry": "BDF"
        }
    }]
}

The following should work:

import json

# Read the whole file at once rather than line by line
with open('./jsonpath.json', 'r') as f:
    data = json.loads(f.read())
print(data)
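
If editing the file by hand is not practical, the same wrapping can be done in code. The following is only a minimal sketch: it assumes the file holds nothing but objects separated by commas (possibly with a trailing comma after the last one), and the helper name parse_comma_separated_objects and the inline sample are made up for illustration. It wraps the text in [] instead of {"test": [...]}, which gives the list directly.

```python
import json

def parse_comma_separated_objects(text):
    """Parse text like '{...},\n{...}' by wrapping it in a JSON array."""
    # Drop surrounding whitespace and any trailing comma after the last object.
    body = text.strip().rstrip(",")
    return json.loads("[" + body + "]")

# Inline sample standing in for the file contents (hypothetical test data).
sample = '''
{
    "0": "mdm-898040540420",
    "1": {"dchannel": "FR al"},
    "2": {"dchannel": "FR Website"},
    "3": {"dcountry": "BDF"}
},
{
    "0": "mdm-846290540037",
    "1": {"dchannel": "FR alk"},
    "2": {"dchannel": "FR Website"},
    "3": {"dcountry": "BDF"}
},
'''

records = parse_comma_separated_objects(sample)
print(len(records))
```

For a real file, pass f.read() to the helper instead of the sample string. Reading line by line cannot work here, because no single line of a pretty-printed object is valid JSON on its own.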

Upvotes: 1

skarit

Reputation: 11

Why don't you try pandas?

from pandas import read_json
df = read_json("C:\\Users\\viv\\Downloads\\2020_11_21-10_31_03_PM_v2.json")
df.to_csv('save_path/file_name.csv')

Try using different values for the orient parameter of read_json.
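
Note that read_json still needs valid JSON as input, so the records have to be parsed into a list of dicts first. Once they are, the CSV layout from the question (id, dchannel, dcountry) could be produced roughly as below. This is a sketch using the standard csv module rather than pandas; the flatten helper is hypothetical, and it assumes "0" always holds the id and that the channels of one record should be joined with ", ".

```python
import csv
import io

def flatten(record):
    """Collapse one parsed object into an id/dchannel/dcountry row."""
    channels, countries = [], []
    for value in record.values():
        # Nested dicts carry either a dchannel or a dcountry entry.
        if isinstance(value, dict):
            if "dchannel" in value:
                channels.append(value["dchannel"])
            if "dcountry" in value:
                countries.append(value["dcountry"])
    return {
        "id": record.get("0"),
        "dchannel": ", ".join(channels),
        "dcountry": ", ".join(countries),
    }

# One already-parsed record, copied from the question's sample data.
records = [
    {"0": "mdm-898040540420", "1": {"dchannel": "FR al"},
     "2": {"dchannel": "FR Website"}, "3": {"dcountry": "BDF"}},
]

# Write to an in-memory buffer; use open(path, "w", newline="") for a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "dchannel", "dcountry"])
writer.writeheader()
writer.writerows(flatten(r) for r in records)
csv_text = buffer.getvalue()
print(csv_text)
```

The csv writer quotes the dchannel field automatically because it contains a comma. If you prefer pandas, building a DataFrame from the flattened rows and calling to_csv on it should give an equivalent file.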

Upvotes: 0
