Yaroslav

Reputation: 97

Error reading big json file due to json.load

I have a problem reading a big JSON file. json.load raises: JSONDecodeError: Extra data: line 1 column 884 (char 883).

The file test2.json is here: https://github.com/SilverYar/TransportDataMiner

The error is raised by these lines of code:

import json
import string

import nltk
from nltk.stem.snowball import RussianStemmer
from nltk.corpus import stopwords

with open('C:\\Creme\\token\\test2.json') as fin:
    text = json.load(fin)

I don’t understand how to fix it. Help me fix it.

Upvotes: 0

Views: 1251

Answers (1)

Merelda

Reputation: 1338

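The content of your JSON file does not seem to be valid: it contains multiple objects written back to back, with no "," between them and no enclosing array. json.load parses the first object and then reports everything after it as "Extra data".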

For example, a valid JSON document holding several objects would be an array:

[{"title": "some text", "subtitle": "some text"},
 {"title": "some text", "subtitle": "some text"},
 {"title": "some text", "subtitle": "some text"}]
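To see the difference on a small scale, here is a minimal sketch using two made-up strings (`broken` and `valid` are illustrative, not taken from your file). The first mimics objects written back to back, the second wraps the same objects in an array:

```python
import json

# Two objects with nothing between them, like the contents of test2.json
broken = '{"title": "a"}{"title": "b"}'

try:
    json.loads(broken)
except json.JSONDecodeError as exc:
    # The parser finishes the first object, then finds leftover text
    message = exc.msg
    position = exc.pos
    print(message, "at char", position)

# The same objects, comma-separated inside an array, parse fine
valid = '[{"title": "a"},{"title": "b"}]'
parsed = json.loads(valid)
print(len(parsed))
```

The position reported in the exception is where the first object ends, which is why your traceback points at char 883 rather than at the start of the file.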

A simple hack is to read the file as text, insert the missing commas between the objects, and wrap the result in square brackets so that it parses as one JSON array:

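import json

with open('test2.json', 'r') as fin:
    text = fin.read()
    # insert the missing commas between adjacent objects
    formatted_text = text.replace('}{', '},{')
    # wrap in [ ] so the objects form a single JSON array
    json_data = json.loads(f'[{formatted_text}]')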

print(len(json_data))
# 11772
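Note that the string replace is only safe as long as no string value inside the file contains the characters `}{`, and it will not help if the objects are separated by newlines or spaces. A more defensive sketch, assuming the file is simply JSON values written one after another, is to let the standard library decoder consume one value at a time with `json.JSONDecoder.raw_decode`. The helper name `iter_concatenated_json` and the `sample` string are made up for illustration:

```python
import json

def iter_concatenated_json(text):
    """Yield each top-level JSON value from a string of back-to-back values."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so skip it here
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # returns the decoded value and the index where it ended
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Illustrative input: the first title contains "}{", which would break
# the replace-based hack, and the last object follows a newline.
sample = '{"title": "a}{b"}{"title": "c"}\n{"title": "d"}'
records = list(iter_concatenated_json(sample))
print(len(records))
```

For your file you would pass in the result of `fin.read()` instead of `sample`. I have not run this variant against test2.json, so treat it as a starting point rather than a verified fix.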

Upvotes: 2
