Robin

Reputation: 69

Merge dictionaries with same key from two lists of dicts in python

I have two dictionaries, as below. Both dictionaries have a list of dictionaries as the value associated with their properties key; each dictionary within these lists has an id key. I wish to merge my two dictionaries into one such that the properties list in the resulting dictionary only has one dictionary for each id.

{
   "name":"harry",
   "properties":[
      {
         "id":"N3",
         "status":"OPEN",
         "type":"energetic"
      },
      {
         "id":"N5",
         "status":"OPEN",
         "type":"hot"
      }
   ]
}

and the other dictionary:

{
   "name":"harry",
   "properties":[
      {
         "id":"N3",
         "type":"energetic",
         "language": "english"
      },
      {
         "id":"N6",
         "status":"OPEN",
         "type":"cool"
      }
   ]
}

The output I am trying to achieve is:

{
   "name":"harry",
   "properties":[
      {
         "id":"N3",
         "status":"OPEN",
         "type":"energetic",
         "language": "english"
      },
      {
         "id":"N5",
         "status":"OPEN",
         "type":"hot"
      },
      {
         "id":"N6",
         "status":"OPEN",
         "type":"cool"
      }
   ]
}

Since id N3 appears in both lists, those two dicts should be merged so that the result has all of their fields. So far I have tried using itertools and the following:

ds = [d1, d2]
d = {}
for k in d1.keys():
  d[k] = tuple(d[k] for d in ds)

Could someone please help in figuring this out?

Upvotes: 0

Views: 1553

Answers (2)

Alex Reynolds

Reputation: 96927

It might help to treat the two objects as elements each in their own lists. Maybe you have other objects with different name values, such as might come out of a JSON-formatted REST request.

Then you could do a left outer join on both name and id keys:

#!/usr/bin/env python

a = [
    {
        "name": "harry",
        "properties": [
            {
                "id":"N3",
                "status":"OPEN",
                "type":"energetic"
            },
            {
                "id":"N5",
                "status":"OPEN",
                "type":"hot"
            }
        ]
    }
]

b = [
    {
        "name": "harry",
        "properties": [
            {
                "id":"N3",
                "type":"energetic",
                "language": "english"
            },
            {
                "id":"N6",
                "status":"OPEN",
                "type":"cool"
            }
        ]
    }
]

a_names = set()
a_prop_ids_by_name = {}
a_by_name = {}
for ao in a:
    an = ao['name']
    a_names.add(an)
    if an not in a_prop_ids_by_name:
        a_prop_ids_by_name[an] = set()
    for ap in ao['properties']:
        api = ap['id']
        a_prop_ids_by_name[an].add(api)
    a_by_name[an] = ao

res = []

for bo in b:
    bn = bo['name']
    if bn not in a_names:
        res.append(bo)
    else:
        ao = a_by_name[bn]
        bp = bo['properties']
        for bpo in bp:
            if bpo['id'] not in a_prop_ids_by_name[bn]:
                ao['properties'].append(bpo)
        res.append(ao)

print(res)

The idea above is to process list a first, collecting its names and its property ids per name. Both collections are Python sets, so their members are always unique and membership tests are fast.

Once you have these sets, you can do the left outer join on the contents of list b.

For each object in b, there are two cases. If no object in a shares its name, you add the object to the result as-is. If an object in a does share its name, you iterate over the b object's properties and look for ids not already in the ids-by-name set for a. You append those missing properties to the a object, and then add that object to the result.

Output:

[{'name': 'harry', 'properties': [{'id': 'N3', 'status': 'OPEN', 'type': 'energetic'}, {'id': 'N5', 'status': 'OPEN', 'type': 'hot'}, {'id': 'N6', 'status': 'OPEN', 'type': 'cool'}]}]

This doesn't do any error checking on input, and it relies on name values being unique within each list. If the same name appears more than once in either list, you may get incorrect or unexpected output.
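Note that the join above skips any property in b whose id already exists in a, so fields that appear only in b's copy (such as language for N3) are not carried over, as the output shows. Below is a minimal sketch of one way to also merge those fields. It is not the code above: the helper name merge_by_name_and_id is hypothetical, it assumes ids are unique within each properties list, it lets values from b win on conflict, it keeps objects from a that have no match in b, and it copies the inputs instead of modifying a.

```python
import copy


def merge_by_name_and_id(a, b):
    """Merge two lists of objects on 'name', then merge properties on 'id'.

    Hypothetical helper: assumes names are unique within each list and
    ids are unique within each properties list. Values from b win on conflict.
    """
    # Deep-copy so neither input list is modified.
    merged = {obj["name"]: copy.deepcopy(obj) for obj in a}

    for b_obj in b:
        name = b_obj["name"]
        if name not in merged:
            # Name only exists in b: keep a copy of the object as-is.
            merged[name] = copy.deepcopy(b_obj)
            continue

        # Index the existing properties by id for constant-time lookup.
        props_by_id = {p["id"]: p for p in merged[name]["properties"]}
        for b_prop in b_obj["properties"]:
            if b_prop["id"] in props_by_id:
                # Same id on both sides: combine the fields.
                props_by_id[b_prop["id"]].update(b_prop)
            else:
                # Id only exists in b: append a copy of the property.
                new_prop = copy.deepcopy(b_prop)
                merged[name]["properties"].append(new_prop)
                props_by_id[new_prop["id"]] = new_prop

    return list(merged.values())


a = [{"name": "harry", "properties": [
    {"id": "N3", "status": "OPEN", "type": "energetic"},
    {"id": "N5", "status": "OPEN", "type": "hot"},
]}]
b = [{"name": "harry", "properties": [
    {"id": "N3", "type": "energetic", "language": "english"},
    {"id": "N6", "status": "OPEN", "type": "cool"},
]}]

res = merge_by_name_and_id(a, b)
print(res)
```

With the sample data, the N3 entry in the result should carry status, type and language, followed by N5 and N6.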

Upvotes: 1

Bhagyesh Dudhediya

Reputation: 1856

Here is one approach:

a = {
   "name":"harry",
   "properties":[
      {
         "id":"N3",
         "status":"OPEN",
         "type":"energetic"
      },
      {
         "id":"N5",
         "status":"OPEN",
         "type":"hot"
      }
   ]
}
b = {
   "name":"harry",
   "properties":[
      {
         "id":"N3",
         "type":"energetic",
         "language": "english"
      },
      {
         "id":"N6",
         "status":"OPEN",
         "type":"cool"
      }
   ]
}

# Map each id to its index in the respective properties list
a_ids = {item['id']: index for index, item in enumerate(a['properties'])}  # {'N3': 0, 'N5': 1}
b_ids = {item['id']: index for index, item in enumerate(b['properties'])}  # {'N3': 0, 'N6': 1}

# Loop through the ids found in a
for prop_id in a_ids:
    # If the same id exists in b, update b's entry with a's key/value pairs
    # (values from a overwrite values from b for shared keys)
    if prop_id in b_ids:
        b['properties'][b_ids[prop_id]].update(a['properties'][a_ids[prop_id]])
    # If it does not exist in b, just append a's entry
    else:
        b['properties'].append(a['properties'][a_ids[prop_id]])

# b is modified in place and now holds the merged result
print(b)

Output:

{'name': 'harry', 'properties': [{'id': 'N3', 'type': 'energetic', 'language': 'english', 'status': 'OPEN'}, {'id': 'N6', 'status': 'OPEN', 'type': 'cool'}, {'id': 'N5', 'status': 'OPEN', 'type': 'hot'}]}
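The loop above updates b in place, and the merged list follows b's order (N3, N6, N5). If the inputs should stay untouched and the order from the question's expected output is preferred, something along these lines may work. This is only a sketch: merge_properties is a hypothetical name, it assumes every property has an id key, and it lets fields from the second dictionary overwrite those from the first when both define them.

```python
def merge_properties(first, second):
    """Return a new dict whose properties list has one entry per id.

    Hypothetical helper: neither input is modified.
    """
    by_id = {}
    for source in (first, second):
        for prop in source["properties"]:
            # setdefault creates a fresh dict the first time an id is seen;
            # dicts preserve insertion order, so ids stay in first-seen order.
            by_id.setdefault(prop["id"], {}).update(prop)

    # Copy the remaining top-level keys, then replace the properties list.
    return {**first, **second, "properties": list(by_id.values())}


a = {"name": "harry", "properties": [
    {"id": "N3", "status": "OPEN", "type": "energetic"},
    {"id": "N5", "status": "OPEN", "type": "hot"},
]}
b = {"name": "harry", "properties": [
    {"id": "N3", "type": "energetic", "language": "english"},
    {"id": "N6", "status": "OPEN", "type": "cool"},
]}

merged = merge_properties(a, b)
print(merged)
```

Because each merged property is built in a new dict, a and b keep their original contents after the call.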

Upvotes: 1
