Reputation: 1713
I have a text file that contains JSON objects in the format below:
{owner:<value>, data:<value>}
{owner:<value>, data:<value>}
{owner:<value>, data:<value>}
{owner:<value>, data:<value>}
{owner:<value>, data:<value>}
Note that they are separated only by a space. However, I know they are all valid JSON objects containing the same keys.
How can I read this file in Python and convert it to a valid JSON file?
Note that the file is pretty big, so I need to read it as a stream rather than loading it all at once.
Upvotes: 0
Views: 1027
Reputation: 15128
One way to do this is to read each line of the input file, process it, and write it to the output file as you go along, so only one line is in memory at a time.
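with open("input.data") as infile, open("output.json", "w") as outfile:
    first_line = True
    outfile.write("[")
    for line in infile:
        line = line.strip()
        if not line:
            continue  # skip blank lines so no empty array element is written
        data = ""
        if first_line:
            first_line = False
        else:
            data += ","  # separate this object from the previous one
        data += "\n " + line
        outfile.write(data)
    outfile.write("\n]")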
As an example, I tested this method in the Python shell:
>>> def write_lines_to_json(input_lines, indent=" "):
        yield "["
        first_line = True
        for line in input_lines:
            data = ""
            if first_line:
                first_line = False
            else:
                data += ","
            data += "\n" + indent + line.strip()
            yield data
        yield "\n]"
# This is the contents of the input file
>>> with open("/tmp/input.data") as fobj:
        print(fobj.read())
{"owner": "a", "data": 0}
{"owner": "b", "data": 1}
{"owner": "c", "data": 2}
{"owner": "d", "data": 3}
{"owner": "e", "data": 4}
{"owner": "f", "data": 5}
{"owner": "g", "data": 6}
{"owner": "h", "data": 7}
{"owner": "i", "data": 8}
{"owner": "j", "data": 9}
>>> with open("/tmp/input.data") as infile, open("/tmp/output.json", "w") as outfile:
        for data in write_lines_to_json(infile):
            outfile.write(data)
# This is the contents of the output json file
>>> with open("/tmp/output.json") as fobj:
        print(fobj.read())
[
{"owner": "a", "data": 0},
{"owner": "b", "data": 1},
{"owner": "c", "data": 2},
{"owner": "d", "data": 3},
{"owner": "e", "data": 4},
{"owner": "f", "data": 5},
{"owner": "g", "data": 6},
{"owner": "h", "data": 7},
{"owner": "i", "data": 8},
{"owner": "j", "data": 9}
]
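The method above assumes one object per line. If the objects are instead separated by spaces on a single line, as the question's wording suggests, a possible sketch is to read the file in chunks and use `json.JSONDecoder.raw_decode`, which parses one value starting at a given index and returns the index where it stopped. The names `iter_json_objects` and `write_json_array` and the default chunk size are my own choices for illustration, not part of any library.

```python
import io
import json


def iter_json_objects(fobj, chunk_size=65536):
    """Yield JSON values from a text stream of whitespace-separated values.

    Hypothetical helper: only the unparsed tail of the input is buffered.
    """
    decoder = json.JSONDecoder()
    buffer = ""
    while True:
        chunk = fobj.read(chunk_size)
        at_eof = not chunk
        buffer += chunk
        pos = 0
        while True:
            # raw_decode does not skip leading whitespace, so do it here
            while pos < len(buffer) and buffer[pos].isspace():
                pos += 1
            if pos >= len(buffer):
                break
            try:
                obj, end = decoder.raw_decode(buffer, pos)
            except json.JSONDecodeError:
                if at_eof:
                    raise  # nothing more to read: the data is malformed
                break  # value is probably cut off; read another chunk
            if end == len(buffer) and not at_eof:
                break  # value may continue in the next chunk (e.g. a number)
            yield obj
            pos = end
        buffer = buffer[pos:]  # keep only the part not yet parsed
        if at_eof:
            break


def write_json_array(objects, outfile):
    """Write an iterable of values to outfile as one JSON array."""
    outfile.write("[")
    for i, obj in enumerate(objects):
        outfile.write(("," if i else "") + "\n  " + json.dumps(obj))
    outfile.write("\n]")


# In-memory demonstration; real file objects opened in text mode work the same way.
source = io.StringIO('{"owner": "a", "data": 0} {"owner": "b", "data": 1}')
target = io.StringIO()
write_json_array(iter_json_objects(source, chunk_size=8), target)
```

One caveat with this sketch: a malformed object in the middle of the file is indistinguishable from a truncated one, so the error is only raised once the end of the file is reached, and the rest of the file is buffered until then.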
Upvotes: 1
Reputation: 1886
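Try this; it builds a list with one parsed object per line:
import json

d = []
with open("file.txt") as f:
    # Iterate over the file object rather than calling readlines(),
    # so the raw text is read one line at a time.
    for line in f:
        line = line.strip()
        if line:  # ignore blank lines
            d.append(json.loads(line))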
OUTPUT (should be)
d = [{owner:<value>, data:<value>},
{owner:<value>, data:<value>},
{owner:<value>, data:<value>},
{owner:<value>, data:<value>}]
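If even the parsed list is too large to keep in memory, a generator can hand back one object at a time instead. This is only a sketch, assuming one object per line; `iter_json_lines` is a hypothetical name, and raising `ValueError` with the line number is just one way to report a malformed line.

```python
import io
import json


def iter_json_lines(fobj):
    """Yield one parsed JSON value per non-blank line of fobj."""
    for lineno, line in enumerate(fobj, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            # Say which line failed instead of surfacing a bare parse error
            raise ValueError(f"line {lineno}: invalid JSON ({exc.msg})") from exc


# In-memory demonstration; an open text file can be passed the same way.
sample = io.StringIO('{"owner": "a", "data": 0}\n\n{"owner": "b", "data": 1}\n')
owners = [record["owner"] for record in iter_json_lines(sample)]
```

Because the generator never builds the full list, the caller decides how much to keep, for example by writing each object out immediately or by aggregating only the fields it needs.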
Upvotes: 2