Reputation: 2012
I have a txt file that contains some plain text and a JSON-style text block. I want to parse the file and extract the JSON block into a Python dict object.
For example, the txt file may look like:
1234567
asdfjkl
{
    "Name": {
        "given ": "kevin"
    },
    "info": [
        "asdf",
        "fda",
        "sdf"
    ]
}
There is exactly one valid JSON block in each txt file. I couldn't find anything in the json package that handles this. Any help will be appreciated.
Upvotes: 2
Views: 1296
Reputation: 710
If you're in control of the input, don't do it this way. If the gal or guy on the input end is the client/boss, curse the code gods under your breath and write a helper function like this:
# Returns a list of plaintext lines and a JSON string.
def split_text_and_json(filename):
    textlines = []
    jsonlines = []
    bracketcount = 0  # number of '{' currently open
    with open(filename) as f:
        for line in f:
            bracketcount += line.count('{')
            if bracketcount:
                # Inside (or just entering) the JSON block.
                jsonlines.append(line)
            else:
                textlines.append(line)
            bracketcount -= line.count('}')
    return (textlines, ''.join(jsonlines))

plaintextpart, jsonpart = split_text_and_json('file.txt')
You could do this inline if the JSON were always guaranteed to be the last part of the file.
Now for the bad stuff: you need to handle files that include the { } characters outside of the JSON. This function doesn't. In that case the stray lines end up in the JSON string, and you will get an error when you try to load it.
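One way to cope with stray braces, as a rough sketch rather than a drop-in fix: try to decode at every '{' and keep the first position that actually parses. This relies on json.JSONDecoder.raw_decode from the standard library; the function name extract_first_json_object and the sample string are made up for illustration.

```python
import json

def extract_first_json_object(text):
    """Return the first JSON object that can be decoded starting at a '{' in text."""
    decoder = json.JSONDecoder()
    start = text.find('{')
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; move on to the next one.
            start = text.find('{', start + 1)
    raise ValueError('no JSON object found')

# Hypothetical input: plain text containing braces, then the real JSON block.
sample = (
    'note {not json}\n'
    '1234567\n'
    '{"Name": {"given ": "kevin"}, "info": ["asdf", "fda", "sdf"]}\n'
)
print(extract_first_json_object(sample))
```

Caveat: a stray "{}" in the plain text is itself valid JSON and would be returned first, so this only helps when the junk braces don't happen to form valid JSON.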
Upvotes: 0
Reputation: 71610
Assuming the JSON block is always the last thing in the file:
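from ast import literal_eval

with open('filename.txt', 'r') as f:
    s = f.read()
# Slice from the first '{' to the end of the file.
# (No "- 1" on the index: if '{' is the very first character,
# index - 1 would be -1 and the slice would only hold the last character.)
print(literal_eval(s[s.index('{'):]))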
Better to use json.loads, which parses actual JSON (including true, false and null) rather than Python literals:
from json import loads

with open('filename.txt', 'r') as f:
    s = f.read()
print(loads(s[s.index('{'):]))
Both output:
{'Name': {'given ': 'kevin'}, 'info': ['asdf', 'fda', 'sdf']}
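To illustrate why json.loads is the safer of the two, here is a small sketch with a made-up JSON string (not from the question) that contains true and null. These are valid JSON but not Python literals, so literal_eval rejects them:

```python
import json
from ast import literal_eval

# Hypothetical JSON text using JSON-only keywords.
text = '{"ok": true, "value": null}'

parsed = json.loads(text)
print(parsed)

try:
    literal_eval(text)
    literal_eval_failed = False
except ValueError:
    # literal_eval only understands Python literals such as True and None.
    literal_eval_failed = True
print(literal_eval_failed)
```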
Upvotes: 3