milo

Reputation: 33

If json1 has data where json2 is incorrect or blank, create an output JSON file with the updated data

I have a project where I have to open a json1 file (with multiple key/value pairs), compare the desired fields against what is in the json2 file, and, where json1 has data and json2 is incorrect or blank, create an output JSON file with the updated data. I know how to work with dictionaries (JSON files), but I am having a hard time wrapping my head around how to approach this project.

Example data from the JSON files. I'm dealing with over 3k hosts.

json1 ={
    "host.com": {
        "parent": "undefined", 
        "sys_os": "redhad", 
        "zone": "qa ", 
        "jdk": "undefined", 
        "appName": "someapp", 
        "admin": "undefined", 
        "ip": "127.0.0.1", 
        "notes": "undefined", 

  }

}

json2 ={
    "host.com": {
        "parent": "undefined", 
        "sys_os": "redhad", 
        "zone": "dev", 
        "jdk": "undefined", 
        "appName": "someapp", 
        "admin": "", 
        "ip": "168.192.1.1", 
        "notes": "new to python", 

  }

}

This is some code that I found, but I am not sure if it is the right path for this project.

data1 = json.load(open(json1))
data2 = json.load(open(json2))

The files that I'm dealing with are dicts, but I didn't do a good job of adding that info to this question. Sorry if my format is not correct.

print type(data1)
<type 'dict'>

for key in set(data1.keys()).union(data2.keys()):
  if key not in data1:
    print "json1 doesn't contain", key
  elif key not in data2:
    print "json2 doesn't contain", key
  elif data1[key] == data2[key]:
    print "match", key
  else:
    print "don't match", key

I'm still trying to understand data structures.

I want my output to be json2 updated with the values from json1.

json3 ={
    "host.com": {
        "parent": "undefined",
        "sys_os": "redhad",
        "zone": "qa ",
        "jdk": "undefined",
        "appName": "someapp",
        "admin": "undefined",
        "ip": "127.0.0.1",
        "notes": "undefined",

  }

}

Upvotes: 0

Views: 67

Answers (1)

satyakrish

Reputation: 119

My understanding and assumptions, inferred from the provided data:

Host data is in a JSON file, and each host is a dictionary with several key/value pairs. Problem: a data migration was done, but during the migration:

  1. certain keys were missed.

  2. keys were migrated, but their values are incorrect or left blank.

  3. some hosts were not migrated at all.
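
Before fixing anything, it can help to see which of these three cases applies to which host. The following is only a minimal sketch, written for Python 3 and assuming both files have already been loaded into plain dicts of the form {host: {key: value}}. The function name classify_migration_gaps and the layout of the report are made up for illustration.

```python
def classify_migration_gaps(orig_data, migrated_data):
    """Group discrepancies into the three cases listed above.

    The function name and the report layout are illustrative only.
    """
    report = {"missing_hosts": [], "missing_keys": [], "wrong_or_blank": []}
    for host, orig_fields in orig_data.items():
        if host not in migrated_data:
            report["missing_hosts"].append(host)  # case 3
            continue
        migrated_fields = migrated_data[host]
        for key, orig_value in orig_fields.items():
            if key not in migrated_fields:
                report["missing_keys"].append((host, key))  # case 1
            elif migrated_fields[key] != orig_value:
                report["wrong_or_blank"].append((host, key))  # case 2
    return report


# Tiny hand-made sample, not the real host data
orig = {
    "host1.com": {"jdk": "undefined", "admin": "undefined", "ip": "127.0.0.1"},
    "host3.com": {"ip": "127.0.0.1"},
}
migrated = {"host1.com": {"admin": "", "ip": "127.0.0.1"}}

report = classify_migration_gaps(orig, migrated)
print(report)
```

With this sample, host3.com is reported as a missing host, jdk as a missing key on host1.com, and admin as a wrong or blank value on host1.com.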

Solution:

  1. orig_data_locn is the location of the original JSON data that was migrated.
  2. migrated_data_locn is the location of the migrated JSON data.
  3. result_locn is the location where the fixed migrated data will be written.
  4. Iterate over each host in orig_data and grab the corresponding host from migrated_data; if the host is not available in migrated_data, add it to migrated_data with the host details from orig_data.
  5. If the host is available in both the original and the migrated data, iterate over each key/value in the original data and check whether the key is available in the migrated data.
  6. If the key is found in both original and migrated data and the values are equal -> no operation.
  7. If the key is found in both original and migrated data but the values are not equal -> update the migrated key/value with the original key/value.
  8. If the key is not found in the migrated data -> add the original key/value to the migrated data.

... more comments inside code

import json


def fix_migrated_data(orig_data_locn, migrated_data_locn, result_locn):
    """
    compares the original data and migrated data json files and 
    produces a result json files which has all migration errors fixed
    """
    orig_data = json.loads(open(orig_data_locn).read())
    migrated_data = json.loads(open(migrated_data_locn).read())


    for orig_host, orig_host_data in orig_data.iteritems():    
        # check if the host in original data is available in migrated data
        migrated_host_dict = migrated_data.get(orig_host, None)

        # if host available in migrated data
        if migrated_host_dict:            
            print 'host {} found in migrated data'.format(orig_host)

            # compare original and migrated key/values    
            for orig_key, orig_value in orig_host_data.iteritems():
                # check if key is available in migrated host dict
                migrated_value = migrated_host_dict.get(orig_key, None)

                # if the key has a non-blank value in the migrated host dict,
                # compare it with the original and overwrite it on a mismatch
                if migrated_value:
                    if migrated_value == orig_value:
                        print "original and migrated values are identical"
                    else:
                        print "original and migrated values are not identical"
                        migrated_host_dict[orig_key] = orig_value            
                else:
                    migrated_host_dict[orig_key] = orig_value
        else: 
            # if host not available in migrated data then add host from original data to migrated data
            print 'host {} not found in migrated data'.format(orig_host)
            migrated_data[orig_host] = orig_host_data


    # finally write the fixed migrated data to a file
    with open(result_locn, 'w') as fp:
        json.dump(migrated_data, fp)     
    return


orig_data_locn = 'json1.json'
migrated_data_locn = 'json2.json'
result_locn = 'json3.json'

fix_migrated_data(orig_data_locn, migrated_data_locn, result_locn)
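
The function above uses Python 2 syntax (print statements and dict.iteritems). For anyone on Python 3, a rough equivalent could look like the sketch below. The name fix_migrated_data_py3, the indent=4 argument and the returned dict are additions for illustration and are not part of the original code; the merge rule (the original value always wins) is meant to be the same.

```python
import json


def fix_migrated_data_py3(orig_data_locn, migrated_data_locn, result_locn):
    """Python 3 rendering of fix_migrated_data; original values always win."""
    with open(orig_data_locn) as fp:
        orig_data = json.load(fp)
    with open(migrated_data_locn) as fp:
        migrated_data = json.load(fp)

    for orig_host, orig_host_data in orig_data.items():
        migrated_host_dict = migrated_data.get(orig_host)
        if migrated_host_dict is None:
            # host was never migrated: copy it over wholesale
            migrated_data[orig_host] = dict(orig_host_data)
            continue
        for orig_key, orig_value in orig_host_data.items():
            # missing, blank, or different values are replaced by the original
            if (orig_key not in migrated_host_dict
                    or migrated_host_dict[orig_key] != orig_value):
                migrated_host_dict[orig_key] = orig_value

    # write the fixed data and also hand it back to the caller
    with open(result_locn, "w") as fp:
        json.dump(migrated_data, fp, indent=4)
    return migrated_data
```

It takes the same three locations as the original function, for example 'json1.json', 'json2.json' and 'json3.json'.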

json1.json:

{
    "host1.com": {
        "parent": "undefined", 
        "sys_os": "redhad", 
        "zone": "qa ", 
        "jdk": "undefined", 
        "appName": "someapp", 
        "admin": "undefined", 
        "ip": "127.0.0.1", 
        "notes": "undefined"
                },

    "host2.com": {
        "parent": "host2_parent", 
        "sys_os": "redhad", 
        "zone": "qa ", 
        "jdk": "undefined", 
        "appName": "someapp", 
        "admin": "undefined", 
        "ip": "127.0.0.1", 
        "notes": "undefined"
                },

    "host3.com": {
        "parent": "host3_parent", 
        "sys_os": "redhad", 
        "zone": "qa ", 
        "jdk": "undefined", 
        "appName": "someapp", 
        "admin": "undefined", 
        "ip": "127.0.0.1", 
        "notes": "undefined"
                }     


 }

json2.json:

{
    "host1.com": {
        "parent": "undefined", 
        "sys_os": "redhad", 
        "zone": "qa ", 
        "appName": "someapp", 
        "admin": "", 
        "ip": "", 
        "notes": "undefined",
        "extra_key1": "this is extra key1 in json2",
        "extra_key2": "this is extra key2 in json2"

                },

    "host2.com": {
        "parent": "undefined", 
        "sys_os": "redhad", 
        "zone": "qa ", 
        "appName": "nothing", 
        "admin": "undefined", 
        "ip": "1.2.3.4", 
        "notes": "undefined",
        "extra_key1": "this is extra key1 in json2",
        "extra_key2": "this is extra key2 in json2"

                }


 }

The output that was produced (json3.json):

{
    "host2.com": {
        "sys_os": "redhad",
        "parent": "host2_parent",
        "zone": "qa ",
        "appName": "someapp",
        "admin": "undefined",
        "ip": "127.0.0.1",
        "notes": "undefined",
        "jdk": "undefined",
        "extra_key1": "this is extra key1 in json2",
        "extra_key2": "this is extra key2 in json2"
    },
    "host1.com": {
        "sys_os": "redhad",
        "parent": "undefined",
        "zone": "qa ",
        "appName": "someapp",
        "admin": "undefined",
        "ip": "127.0.0.1",
        "notes": "undefined",
        "jdk": "undefined",
        "extra_key1": "this is extra key1 in json2",
        "extra_key2": "this is extra key2 in json2"
    },
    "host3.com": {
        "parent": "host3_parent",
        "sys_os": "redhad",
        "zone": "qa ",
        "jdk": "undefined",
        "appName": "someapp",
        "admin": "undefined",
        "ip": "127.0.0.1",
        "notes": "undefined"
    }
}
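
Note that every branch of the loop ends with the original value in place, and keys that exist only in the migrated data (extra_key1, extra_key2) are left alone. That is the same result dict.update gives per host, so the merge can be written much shorter. A minimal Python 3 sketch, assuming the data is already loaded into dicts; merge_hosts is a made-up name and the sample is a trimmed-down version of the files above:

```python
import copy


def merge_hosts(orig_data, migrated_data):
    """Return a new dict: migrated data with original values laid on top."""
    result = copy.deepcopy(migrated_data)  # leave the caller's data untouched
    for host, orig_fields in orig_data.items():
        # setdefault creates an empty entry for hosts that were never migrated
        result.setdefault(host, {}).update(orig_fields)
    return result


orig = {
    "host1.com": {"jdk": "undefined", "admin": "undefined", "ip": "127.0.0.1"},
    "host3.com": {"parent": "host3_parent"},
}
migrated = {
    "host1.com": {"admin": "", "ip": "", "extra_key1": "this is extra key1 in json2"},
}

merged = merge_hosts(orig, migrated)
print(merged["host1.com"])
```

The explicit loop is still the better choice if you want to log which values were changed; the update version only produces the final result.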

Upvotes: 0
