Reputation: 33
I have a project where I need to open a json1 file (with multiple key/value pairs), compare the desired fields against what is in a json2 file, and, wherever json1 has data that is incorrect or blank in json2, create an output JSON file. I know how to work with dictionaries (JSON files), but I am having a hard time wrapping my head around how to complete this project.
Example data from the JSON files. I'm dealing with over 3k hosts.
json1 = {
    "host.com": {
        "parent": "undefined",
        "sys_os": "redhad",
        "zone": "qa ",
        "jdk": "undefined",
        "appName": "someapp",
        "admin": "undefined",
        "ip": "127.0.0.1",
        "notes": "undefined"
    }
}
json2 = {
    "host.com": {
        "parent": "undefined",
        "sys_os": "redhad",
        "zone": "dev",
        "jdk": "undefined",
        "appName": "someapp",
        "admin": "",
        "ip": "168.192.1.1",
        "notes": "new to python"
    }
}
This is some code that I found, but I am not sure if it is the right path for this project.
import json

# json1 and json2 here are the paths to the two files
data1 = json.load(open(json1))
data2 = json.load(open(json2))
The files that I'm dealing with are dicts, but I didn't do a good job of adding that info to this question. Sorry if my formatting is not correct.
print type(data1)
<type 'dict'>
# walk every host name that appears in either file
for key in set(data1.keys()).union(data2.keys()):
    if key not in data1:
        print("json1 doesn't contain {}".format(key))
    elif key not in data2:
        print("json2 doesn't contain {}".format(key))
    elif data1[key] == data2[key]:
        print("match {}".format(key))
    else:
        print("don't match {}".format(key))
I'm still trying to understand data structures.
I want my output to be json2 updated with the values from json1.
json3 = {
    "host.com": {
        "parent": "undefined",
        "sys_os": "redhad",
        "zone": "qa ",
        "jdk": "undefined",
        "appName": "someapp",
        "admin": "undefined",
        "ip": "127.0.0.1",
        "notes": "undefined"
    }
}
Upvotes: 0
Views: 67
Reputation: 119
Understanding/assumptions inferred from the provided data:
Host data is in a JSON file, and each host is a dictionary with several key/value pairs. Problem: a data migration was done, but during the migration
certain keys were missed,
some keys were migrated but their values are incorrect or were left blank,
some hosts were not migrated at all.
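The three cases listed above can be detected separately before anything is fixed. The following is only a minimal sketch, assuming both files have already been loaded into dicts; `classify_migration_errors` is a hypothetical helper name, not something from the question or from any library.

```python
def classify_migration_errors(orig_data, migrated_data):
    """Return (missing_hosts, missing_keys, wrong_values) for migrated_data.

    Hypothetical helper: orig_data and migrated_data are assumed to be
    dicts of the form {host: {field: value}}.
    """
    missing_hosts = []   # hosts present in the original but not migrated
    missing_keys = []    # (host, key) pairs absent from the migrated host
    wrong_values = []    # (host, key) pairs whose value is blank or differs
    for host, orig_fields in orig_data.items():
        if host not in migrated_data:
            missing_hosts.append(host)
            continue
        migrated_fields = migrated_data[host]
        for key, orig_value in orig_fields.items():
            if key not in migrated_fields:
                missing_keys.append((host, key))
            elif migrated_fields[key] != orig_value:
                wrong_values.append((host, key))
    return sorted(missing_hosts), sorted(missing_keys), sorted(wrong_values)
```

With 3k hosts, a report like this may be worth reviewing before overwriting anything, since it shows how many entries fall into each category.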
Solution:
... more comments inside code
import json


def fix_migrated_data(orig_data_locn, migrated_data_locn, result_locn):
    """
    Compares the original data and migrated data JSON files and
    produces a result JSON file which has all migration errors fixed.
    """
    with open(orig_data_locn) as fp:
        orig_data = json.load(fp)
    with open(migrated_data_locn) as fp:
        migrated_data = json.load(fp)
    for orig_host, orig_host_data in orig_data.items():
        # check if the host in the original data is available in the migrated data
        migrated_host_dict = migrated_data.get(orig_host, None)
        # if the host is available in the migrated data
        if migrated_host_dict:
            print('host {} found in migrated data'.format(orig_host))
            # compare original and migrated key/values
            for orig_key, orig_value in orig_host_data.items():
                # check if the key is available in the migrated host dict
                migrated_value = migrated_host_dict.get(orig_key, None)
                # if the key is available (and not blank) in the migrated host dict,
                # check if the values match; if not, update the migrated data
                if migrated_value:
                    if migrated_value == orig_value:
                        print("original and migrated values are identical")
                    else:
                        print("original and migrated values are not identical")
                        migrated_host_dict[orig_key] = orig_value
                else:
                    # key missing or value blank: take the original value
                    migrated_host_dict[orig_key] = orig_value
        else:
            # if the host is not available in the migrated data,
            # add the host from the original data to the migrated data
            print('host {} not found in migrated data'.format(orig_host))
            migrated_data[orig_host] = orig_host_data
    # finally write the fixed migrated data to a file
    with open(result_locn, 'w') as fp:
        json.dump(migrated_data, fp)


orig_data_locn = 'json1.json'
migrated_data_locn = 'json2.json'
result_locn = 'json3.json'
fix_migrated_data(orig_data_locn, migrated_data_locn, result_locn)
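Since every branch of the function above ends with the original value stored under each original key, the same result can probably be obtained with a plain dict update per host. This is a sketch under that assumption, written as a pure function on already-loaded dicts so it can be tested without files; `merge_original_into_migrated` is a hypothetical name, and it assumes every host maps to a dict of fields.

```python
import copy


def merge_original_into_migrated(orig_data, migrated_data):
    """Return a new dict: migrated_data with every original field restored.

    Hypothetical helper. Fields that exist only in migrated_data are kept;
    hosts missing from migrated_data are added from orig_data.
    """
    # work on a copy so neither input dict is modified
    merged = copy.deepcopy(migrated_data)
    for host, orig_fields in orig_data.items():
        # create the host if it was not migrated, then let the
        # original values overwrite blank, wrong, or missing ones
        merged.setdefault(host, {}).update(orig_fields)
    return merged
```

The result can then be written out with `json.dump`, as in the function above. Copying the input first is a design choice, not a requirement: it keeps the loaded json2 data intact in case it is needed for a before/after comparison.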
json1.json:
{
    "host1.com": {
        "parent": "undefined",
        "sys_os": "redhad",
        "zone": "qa ",
        "jdk": "undefined",
        "appName": "someapp",
        "admin": "undefined",
        "ip": "127.0.0.1",
        "notes": "undefined"
    },
    "host2.com": {
        "parent": "host2_parent",
        "sys_os": "redhad",
        "zone": "qa ",
        "jdk": "undefined",
        "appName": "someapp",
        "admin": "undefined",
        "ip": "127.0.0.1",
        "notes": "undefined"
    },
    "host3.com": {
        "parent": "host3_parent",
        "sys_os": "redhad",
        "zone": "qa ",
        "jdk": "undefined",
        "appName": "someapp",
        "admin": "undefined",
        "ip": "127.0.0.1",
        "notes": "undefined"
    }
}
json2.json:
{
    "host1.com": {
        "parent": "undefined",
        "sys_os": "redhad",
        "zone": "qa ",
        "appName": "someapp",
        "admin": "",
        "ip": "",
        "notes": "undefined",
        "extra_key1": "this is extra key1 in json2",
        "extra_key2": "this is extra key2 in json2"
    },
    "host2.com": {
        "parent": "undefined",
        "sys_os": "redhad",
        "zone": "qa ",
        "appName": "nothing",
        "admin": "undefined",
        "ip": "1.2.3.4",
        "notes": "undefined",
        "extra_key1": "this is extra key1 in json2",
        "extra_key2": "this is extra key2 in json2"
    }
}
The output that was produced (json3.json):
{
    "host2.com": {
        "sys_os": "redhad",
        "parent": "host2_parent",
        "zone": "qa ",
        "appName": "someapp",
        "admin": "undefined",
        "ip": "127.0.0.1",
        "notes": "undefined",
        "jdk": "undefined",
        "extra_key1": "this is extra key1 in json2",
        "extra_key2": "this is extra key2 in json2"
    },
    "host1.com": {
        "sys_os": "redhad",
        "parent": "undefined",
        "zone": "qa ",
        "appName": "someapp",
        "admin": "undefined",
        "ip": "127.0.0.1",
        "notes": "undefined",
        "jdk": "undefined",
        "extra_key1": "this is extra key1 in json2",
        "extra_key2": "this is extra key2 in json2"
    },
    "host3.com": {
        "parent": "host3_parent",
        "sys_os": "redhad",
        "zone": "qa ",
        "jdk": "undefined",
        "appName": "someapp",
        "admin": "undefined",
        "ip": "127.0.0.1",
        "notes": "undefined"
    }
}
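One way to sanity-check an output like the one above is to confirm that every key/value pair from json1 appears unchanged in json3. The sketch below assumes both files are already loaded as dicts; `result_contains_original` is a hypothetical helper name.

```python
_MISSING = object()  # sentinel so a JSON null value is not confused with "absent"


def result_contains_original(orig_data, result_data):
    """Return True if every original host/key/value is present in result_data."""
    for host, orig_fields in orig_data.items():
        result_fields = result_data.get(host, {})
        for key, orig_value in orig_fields.items():
            if result_fields.get(key, _MISSING) != orig_value:
                return False
    return True
```

Keys that exist only in the result (such as `extra_key1`) are ignored by this check, which matches the output shown, where the extra json2 keys are kept.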
Upvotes: 0