Kevin Fang

Reputation: 2012

How to parse json formatted text at the end of a txt file

I have a txt file that contains some plain text followed by a JSON-formatted text block. I want to parse the file and extract the JSON block into a Python dict.

For example, the txt file may look like:

1234567
asdfjkl
{
  "Name": {
    "given ": "kevin"
  },
  "info": [
    "asdf",
    "fda",
    "sdf"
  ]
}

There is exactly one valid JSON block in each txt file. I couldn't find anything in the json package that does this. Any help will be appreciated.

Upvotes: 2

Views: 1296

Answers (2)

Alex Weavers

Reputation: 710

If you're in control of the input, don't do it this way. If the gal or guy on the input end is the client/boss, curse the code gods under your breath and write a helper function like this:

# Returns a list of plaintext lines and a JSON string.
def split_text_and_json(filename):
    textlines = []
    jsonlines = []
    bracketcount = 0  # number of '{' braces that are still unclosed

    with open(filename) as f:
        for line in f:
            bracketcount += line.count('{')

            # A line belongs to the JSON block while any brace is open.
            if bracketcount:
                jsonlines.append(line)
            else:
                textlines.append(line)
            bracketcount -= line.count('}')
    return textlines, ''.join(jsonlines)

plaintextpart, jsonpart = split_text_and_json('file.txt')

You could do this inline if the JSON was always guaranteed to be the last part of the file.

Now for the bad stuff: this function does not handle files that contain { or } characters outside of the JSON. In that case the extracted string is not valid JSON, and you will get an error when you try to load it.
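One way to tolerate stray braces in the plain text is to try decoding from each { in turn and keep the first position that parses. The sketch below uses json.JSONDecoder.raw_decode from the standard library; the function name extract_json_block and the sample string are made up for illustration, and it assumes the whole file fits in memory.

```python
import json

def extract_json_block(text):
    """Return the first JSON object that can be decoded from `text`.

    Hypothetical helper: tries every '{' from left to right and returns
    the first one that starts a valid JSON object.
    """
    decoder = json.JSONDecoder()
    start = text.find('{')
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever follows it in the string.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; move on to the next one.
            start = text.find('{', start + 1)
    raise ValueError('no JSON object found')

# Sample text with stray braces before the real JSON block.
sample = (
    'a stray } or { in the text\n'
    '1234567\n'
    '{"Name": {"given ": "kevin"}, "info": ["asdf", "fda", "sdf"]}\n'
)
print(extract_json_block(sample))
```

A caveat with this approach: if the plain text itself happens to contain something that parses as a JSON object, such as {}, that is what gets returned.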

Upvotes: 0

U13-Forward

Reputation: 71610

As you said, the JSON is always at the end of the file:

from ast import literal_eval
with open('filename.txt', 'r') as f:
    s = f.read()
    # Slice from the first '{' to the end of the file.
    print(literal_eval(s[s.index('{'):]))

It is better to use json.loads:

from json import loads
with open('filename.txt', 'r') as f:
    s = f.read()
    print(loads(s[s.index('{'):]))

Both output:

{'Name': {'given ': 'kevin'}, 'info': ['asdf', 'fda', 'sdf']}
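A short illustration of why json.loads is the safer choice here: literal_eval only accepts Python literals, so it rejects the JSON keywords true, false and null. The sample string and its keys are invented for this sketch.

```python
import json
from ast import literal_eval

# Hypothetical JSON text using keywords that are not Python literals.
text = '{"active": true, "nickname": null}'

# json.loads maps true/null to Python's True/None.
print(json.loads(text))

# literal_eval sees `true` and `null` as bare names and refuses them.
try:
    literal_eval(text)
except ValueError as exc:
    print('literal_eval failed:', exc)
```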

Upvotes: 3
