Reputation: 97
I have a problem reading a large JSON file. I get this error: JSONDecodeError: Extra data: line 1 column 884 (char 883).
The file test2.json is here: https://github.com/SilverYar/TransportDataMiner
The error is raised by these lines of code:
import json
import string

import nltk
from nltk.stem.snowball import RussianStemmer
from nltk.corpus import stopwords

with open('C:\\Creme\\token\\test2.json') as fin:
    text = json.load(fin)
I don’t understand how to fix it. Help me fix it.
Upvotes: 0
Views: 1251
Reputation: 1338
The content of your JSON file does not seem to be valid: it contains multiple objects, but they are not separated by "," or wrapped in an array.
For example, a valid JSON document holding several objects should look like this:
[{"title":"some text", "subtitle": "some text"},
{"title":"some text", "subtitle": "some text"},
{"title":"some text", "subtitle": "some text"}]
A simple hack is to read the file as a string, insert the missing commas, and wrap the result in brackets so that it becomes a valid JSON array:
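import json

with open('test2.json', 'r') as fin:
    text = fin.read()
# Insert the missing commas between adjacent objects
formatted_text = text.replace('}{', '},{')
# Wrap everything in [...] so the whole string parses as one JSON array
json_data = json.loads(f'[{formatted_text}]')
print(len(json_data))
# 11772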
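One caveat: the string replacement would also change any '}{' that happens to appear inside a string value. If that is a concern, a possible alternative is to decode the objects one at a time with json.JSONDecoder.raw_decode from the standard library. The following is only a sketch: the helper name iter_json_objects and the sample string are made up for illustration, and it has not been run against test2.json.

```python
import json

def iter_json_objects(text):
    """Yield each top-level JSON value from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not accept leading whitespace, so skip it manually
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # raw_decode returns the parsed value and the index just past it
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Made-up sample: the first title contains '}{' inside a string value
sample = '{"title": "a}{b", "subtitle": "x"}{"title": "c", "subtitle": "y"}\n'
records = list(iter_json_objects(sample))
print(len(records))
```

With the real file you would pass the result of fin.read() instead of sample. Because the decoder tracks string boundaries itself, a '}{' inside a value is left untouched.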
Upvotes: 2