Reputation: 14990
I have a set of JSON files that contain some information. The data below is the value of the key 'BrowserInfo'. I want to extract the following fields from it: Title, Links, Browser, Platform and CPUs. I then want to add those fields as keys in the JSON file and assign the extracted values to them.
Title: Worlds best websit | mywebsite.com
Links: 225
Browser: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Ubuntu Chromium/41.0.2272.76 Chrome/41.0.2272.76 Safari/537.36
Platform: Linux x86_64
CPUs: 8
I have written a Python program to descend into the directory and extract the 'BrowserInfo' value from the JSON files.
import json
import os

# Set the directory you want to start from
rootDir = '/home/space'
for dirName, subdirList, fileList in os.walk(rootDir):
    print('Found directory: %s' % dirName)
    for fname in fileList:
        # Build the full path from the directory currently being walked
        path = os.path.join(dirName, fname)
        with open(path, 'r+') as f:
            json_data = json.load(f)
            BrowserInfo = json_data['BrowserInfo']
            print(BrowserInfo)
How do I extract the values and add the new key-value pairs to the JSON files using Python?
Upvotes: 0
Views: 2339
Reputation: 662
A quick demo of the parsing:
>>> import re, itertools
>>> BrowserInfo
'Title: Worlds best websit | mywebsite.com\nLinks: 225\nBrowser: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Ubuntu Chromium/41.0.2272.76 Chrome/41.0.2272.76 Safari/537.36\nPlatform: Linux x86_64\nCPUs: 8'
>>> re.split(':|\n', BrowserInfo)
['Title', ' Worlds best websit | mywebsite.com', 'Links', ' 225', 'Browser', ' Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Ubuntu Chromium/41.0.2272.76 Chrome/41.0.2272.76 Safari/537.36', 'Platform', ' Linux x86_64', 'CPUs', ' 8']
>>> s = re.split(':|\n', BrowserInfo)
>>> {k.strip(): v.strip() for k, v in zip(s[::2], s[1::2])}
{'Platform': 'Linux x86_64', 'Browser': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Ubuntu Chromium/41.0.2272.76 Chrome/41.0.2272.76 Safari/537.36', 'CPUs': '8', 'Links': '225', 'Title': 'Worlds best websit | mywebsite.com'}
Thus
json_data['BrowserInfo'] = {k.strip(): v.strip() for k, v in zip(s[::2], s[1::2])}
would give you the parsed fields in your JSON data.
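One caveat: splitting on every ':' only works while no value contains a colon itself, since a title such as 'News: today' would shift all the following pairs. A minimal sketch of a variant that splits each line only once, assuming the same 'Key: value' layout as the sample (the helper name parse_pairs and the sample string are made up for illustration):

```python
def parse_pairs(text):
    """Parse newline-separated 'Key: value' lines into a dict."""
    result = {}
    for line in text.splitlines():
        if ':' not in line:
            continue  # skip blank or malformed lines
        # Split on the first colon only, so colons inside the value survive
        key, value = line.split(':', 1)
        result[key.strip()] = value.strip()
    return result


# Hypothetical sample with a colon inside the Title value
sample = 'Title: News: today | mywebsite.com\nLinks: 225\nCPUs: 8'
parsed = parse_pairs(sample)
print(parsed['Title'])
```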
Upvotes: 0
Reputation: 30210
Assuming (and this seems like a big assumption) that BrowserInfo contains newline-separated key/value pairs, with each key and value separated by ': ', you could extract the keys and values with:
for line in BrowserInfo.splitlines():
    k, v = line.split(': ', 1)
Then just insert them wherever you want in the dictionary, e.g.:
json_data['BrowserInfo'] = {}
for line in BrowserInfo.splitlines():
    k, v = line.split(': ', 1)
    json_data['BrowserInfo'][k] = v
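To actually get the new keys into the files, the modified dictionary still has to be written back. Here is a rough sketch of one way to do that, assuming the fields should become top-level keys next to 'BrowserInfo' and that rewriting each file in place is acceptable. The helper names parse_browser_info and add_browser_fields are made up for this example, and the demo runs on a throwaway file rather than your real data:

```python
import json
import os
import tempfile


def parse_browser_info(text):
    """Turn newline-separated 'Key: value' lines into a dict."""
    fields = {}
    for line in text.splitlines():
        if ': ' not in line:
            continue  # skip lines that do not match the assumed layout
        k, v = line.split(': ', 1)
        fields[k] = v
    return fields


def add_browser_fields(path):
    """Add the parsed fields as top-level keys and rewrite the file in place."""
    with open(path, 'r+') as f:
        json_data = json.load(f)
        json_data.update(parse_browser_info(json_data['BrowserInfo']))
        f.seek(0)      # go back to the start before writing
        json.dump(json_data, f, indent=2)
        f.truncate()   # drop leftover bytes if the new content is shorter
    return json_data


# Demo on a throwaway file (hypothetical sample data)
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'sample.json')
with open(demo_path, 'w') as f:
    json.dump({'BrowserInfo': 'Links: 225\nPlatform: Linux x86_64\nCPUs: 8'}, f)

updated = add_browser_fields(demo_path)
with open(demo_path) as f:
    reloaded = json.load(f)
print(reloaded['Platform'])
```

In the os.walk loop you would call add_browser_fields(os.path.join(dirName, fname)) for each file. The seek/truncate pair matters because the file is opened in 'r+' mode: without it the new JSON would be appended after the old content instead of replacing it.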
Upvotes: 1