Nasif Imtiaz Ohi

Reputation: 1713

How to read a text file full of JSON objects separated by whitespace in Python

I have a text file that contains JSON objects in the format below:

{owner:<value>, data:<value>}
{owner:<value>, data:<value>}
{owner:<value>, data:<value>}
{owner:<value>, data:<value>}
{owner:<value>, data:<value>}

Note that they are separated only by whitespace. However, I know they are all valid JSON objects containing the same keys.

How can I read this file in Python and convert it to a valid JSON file?

Note that the file is pretty big, so I need to read it as a stream rather than loading it all at once.

Upvotes: 0

Views: 1027

Answers (2)

damon

Reputation: 15128

One way to do this is to read, process, and write each line of the input file as you go along, assuming each JSON object sits on its own line.

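with open("input.data") as infile, open("output.json", "w") as outfile:
    first_line = True
    outfile.write("[")

    for line in infile:
        line = line.strip()
        if not line:
            continue  # skip blank lines so no empty array element is written

        data = ""
        if first_line:
            first_line = False
        else:
            data += ","  # separate this object from the previous one
        data += "\n  " + line
        outfile.write(data)

    outfile.write("\n]")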

As an example, I tested this method in the Python shell:

>>> def write_lines_to_json(input_lines, indent="  "):
        yield "["

        first_line = True
        for line in input_lines:
            data = ""
            if first_line:
                first_line = False
            else:
                data += ","

            data += "\n" + indent + line.strip()
            yield data

        yield "\n]"

# This is the contents of the input file
>>> with open("/tmp/input.data") as fobj:
        print(fobj.read())
{"owner": "a", "data": 0}
{"owner": "b", "data": 1}
{"owner": "c", "data": 2}
{"owner": "d", "data": 3}
{"owner": "e", "data": 4}
{"owner": "f", "data": 5}
{"owner": "g", "data": 6}
{"owner": "h", "data": 7}
{"owner": "i", "data": 8}
{"owner": "j", "data": 9}

>>> with open("/tmp/input.data") as infile, open("/tmp/output.json", "w") as outfile:
        for data in write_lines_to_json(infile):
            outfile.write(data)

# This is the contents of the output json file
>>> with open("/tmp/output.json") as fobj:
        print(fobj.read())
[
  {"owner": "a", "data": 0},
  {"owner": "b", "data": 1},
  {"owner": "c", "data": 2},
  {"owner": "d", "data": 3},
  {"owner": "e", "data": 4},
  {"owner": "f", "data": 5},
  {"owner": "g", "data": 6},
  {"owner": "h", "data": 7},
  {"owner": "i", "data": 8},
  {"owner": "j", "data": 9}
]
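
The approach above copies each line through without checking it. As a hedged variant, here is a minimal sketch that parses every line with json.loads before writing it, so a malformed line fails early instead of producing an invalid output file. The function name iter_json_array_chunks is hypothetical, and the sketch still assumes one JSON object per line.

```python
import json


def iter_json_array_chunks(lines, indent="  "):
    """Yield text chunks that together form a JSON array.

    Hypothetical helper: assumes each non-blank input line holds
    exactly one complete JSON value.
    """
    yield "["
    first = True
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # json.loads raises json.JSONDecodeError (a ValueError) on bad input
        obj = json.loads(line)
        prefix = "" if first else ","
        first = False
        # Re-serialize the parsed object so the output is known to be valid
        yield prefix + "\n" + indent + json.dumps(obj)
    yield "\n]"
```

It can be used the same way as write_lines_to_json: iterate over the generator and write each chunk to the output file. Only one line is held in memory at a time.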

Upvotes: 1

Wonka

Reputation: 1886

Try this; it builds a list with one parsed object per line of the file:

import json
d = []

with open("file.txt") as f:
    # Iterate over the file object directly so only one line is read at a time
    for line in f:
        if line.strip():  # skip blank lines
            d.append(json.loads(line))

OUTPUT (should be)

d = [{owner:<value>, data:<value>},
     {owner:<value>, data:<value>},
     {owner:<value>, data:<value>},
     {owner:<value>, data:<value>}]
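
If the objects are not guaranteed to sit one per line (the question only says they are separated by whitespace), line-based reading will not work. A possible alternative, sketched here under assumptions, is to read the file in chunks and pull objects out of a buffer with json.JSONDecoder.raw_decode. The name iter_json_objects and the chunk_size default are hypothetical, and the sketch assumes every top-level value is an object or array, since a bare number split across two chunks could be decoded too early.

```python
import json


def iter_json_objects(fobj, chunk_size=65536):
    """Yield JSON objects from a text stream, whatever whitespace separates them.

    Hypothetical helper: assumes top-level values are objects or arrays.
    """
    decoder = json.JSONDecoder()
    buffer = ""
    while True:
        chunk = fobj.read(chunk_size)  # empty string signals end of file
        buffer += chunk
        pos = 0
        while True:
            # raw_decode does not accept leading whitespace, so skip it here
            while pos < len(buffer) and buffer[pos].isspace():
                pos += 1
            if pos >= len(buffer):
                break
            try:
                obj, pos = decoder.raw_decode(buffer, pos)
            except json.JSONDecodeError:
                if not chunk:
                    raise  # nothing left to read: the data is malformed
                break  # object is probably incomplete; read another chunk
            yield obj
        buffer = buffer[pos:]  # keep only the unparsed tail
        if not chunk:
            break
```

Because this is a generator, the objects can be consumed one by one, for example by writing each to an output array as in the other answer. One caveat: a malformed object in the middle of the file makes the buffer grow until the end of the file before the error is raised.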

Upvotes: 2
