Reputation: 35
I'm trying to read in a JSON file, determine how many words are in the "text" field, add that information as a new field, "length", and write the new JSON object to a file. I've done that with the following code:
import json
with open("file_read.json", "r") as review_file, open(
        "file_write.json", "w") as review_write:
    for line in review_file:
        review_object = json.loads(line)
        review_object["length"] = len(review_object["text"].split())
        json.dump(review_object, review_write)
The original file is over 200 MB, and I can view it fine in Vim; however, the file I write, which is only 3 MB larger, takes a very long time to load, if it loads at all. Furthermore, there are issues even if I read only the first JSON object. I tried the following after writing the file:
with open("file_write.json", "r") as review_file:
    print review_file.readline()
print("abcd123")
I'm using Vim with python-mode, and when I traverse the first printed statement with the JSON info it is very choppy, but the second statement is not.
Upvotes: 1
Views: 8045
Reputation: 12986
The way you are writing your file, you end up with only one HUGE line, because json.dump does not add a newline after each object.
# example
json.dump([1,2,3], fp)
json.dump({"name": "abc"}, fp)
json.dump(33, fp)
# content of file
# [1, 2, 3]{"name": "abc"}33
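This may explain why it is so slow to read: readline() has to load the whole ~200 MB of text at once, and Vim has to display it as a single line. Parsing that line as JSON will also fail, since several objects written back to back are not one valid JSON document.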
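To see the effect without touching the real data, here is a minimal sketch that assumes Python 3 and uses an in-memory buffer (io.StringIO) in place of the output file. The buffer and the variable names are illustrative only, not taken from the question.

```python
import io
import json

# In-memory stand-in for the output file (an assumption for this demo).
buf = io.StringIO()

# Repeated json.dump calls write the objects back to back, with no separator.
for obj in ([1, 2, 3], {"name": "abc"}, 33):
    json.dump(obj, buf)

content = buf.getvalue()
line_count = len(content.splitlines())  # everything sits on a single line

# Parsing the concatenated text as one JSON document is rejected.
try:
    json.loads(content)
    parse_error = None
except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
    parse_error = exc

print(repr(content))
print("lines:", line_count)
print("parse failed:", parse_error is not None)
```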
To fix it, write each object followed by a newline instead:
fp.write(json.dumps(review_object) + "\n")
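Put together with the loop from the question, the fix might look like the sketch below. It assumes the input has one JSON object per line, as in the question. The function name add_length_field and the blank-line check are additions for illustration, not part of the original code.

```python
import json


def add_length_field(src_path, dst_path):
    """Copy a one-object-per-line JSON file, adding a word-count "length" field."""
    with open(src_path, "r") as review_file, open(dst_path, "w") as review_write:
        for line in review_file:
            if not line.strip():
                continue  # skip blank lines (an added safeguard)
            review_object = json.loads(line)
            review_object["length"] = len(review_object["text"].split())
            # One object per line, so the output can be read back line by line.
            review_write.write(json.dumps(review_object) + "\n")
```

Calling add_length_field("file_read.json", "file_write.json") should then produce a file in which readline() returns a single review, not the entire contents.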
Upvotes: 5