sevenfold

Reputation: 101

Python: group a list of dicts

Is there an easy way in Python 3 to group a list of dicts by a key? I have a complicated input list that I want to reformat.

My input object looks like this:

my_input = [
  {
    'name': 'nameA',
    'departments': [
      {
        'name': 'dep1',
        'details': [
          {
            'name': 'name_detA',
            'tech_name': 'techNameA',
            'others': None,
            'sub_details': []
          },
          {
            'name': 'name_detB',
            'tech_name': 'techNameB',
            'others': 22,
            'sub_details': [
              {
                'id': 'idB',
                'column2': 'ZZ',
                'column3': 'CCC',
                'column4': {
                  'id': 'id2',
                  'subColumn1': 'HHH',
                  'subColumn1': 'PPPP',
                  'subColumn1': 'FFFFFF'
                }
              }
            ]
          },
          
          
          {
            'name': 'name_detB',
            'tech_name': 'techNameB',
            'others': 22,
            'sub_details': [
              {
                'id': 'idA',
                'column2': 'AA',
                'column3': 'BBB',
                'column4': {
                  'id': 'id1',
                  'subColumn1': 'XXXX',
                  'subColumn1': 'YYYYY',
                  'subColumn1': 'DDDDDD'
                }
              }
            ]
          }
          
        ]
      }
    ]
  }
]

My goal is to group the elements of details that share the same tech_name into one element and merge their sub_details.

Expected output:

my_output = [
  {
    "name": "nameA",
    "departments": [
      {
        "name": "dep1",
        "details": [
          {
            "name": "name_detA",
            "tech_name": "techNameA",
            "others": None,
            "sub_details": []
          },
          {
            "name": "name_detB",
            "tech_name": "techNameB",
            "others": 22,
            "sub_details": [
              {
                "id": "idB",
                "column2": "ZZ",
                "column3": "CCC",
                "column4": {
                  "id": "id2",
                  "subColumn1": "HHH",
                  "subColumn1": "PPPP",
                  "subColumn1": "FFFFFF"
                }
              },
              {
                "id": "idA",
                "column2": "AA",
                "column3": "BBB",
                "column4": {
                  "id": "id1",
                  "subColumn1": "XXXX",
                  "subColumn1": "YYYYY",
                  "subColumn1": "DDDDDD"
                }
              }
            ]
          }
        ]
      }
    ]
  }
]

I tried this:

import itertools
import operator

result_list = []
sub = []
for elem in my_input:
    for data in elem["departments"]:
        for sub_detail, dicts_for_that_sub in itertools.groupby(data["details"], key=operator.itemgetter("sub_details")):
            sub.append({"sub_details": sub_detail})
        print(sub)

But I'm struggling to rebuild the new output from that.

Upvotes: 0

Views: 91

Answers (1)

JonSG

Reputation: 13067

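Assuming that the input I used here is what you really wanted, you are on the right track. Note that my input uses distinct subColumn1, subColumn2 and subColumn3 keys, because a dict literal that repeats a key keeps only the last value. I re-implemented your innermost for loop as a call to a function, but that is not strictly needed.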

I would probably take a slightly different approach to merge_details(), using setdefault() rather than the if/else, but this way is easier to follow if you have not used setdefault() before.
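
For reference, a minimal sketch of what that setdefault() variant could look like. It assumes every detail has the "tech_name" and "sub_details" keys, as in the sample input; the function name merge_details_setdefault and the shortened sample data are only for illustration. Unlike the if/else version, it copies each detail, so the input list is left untouched.

```python
def merge_details_setdefault(details):
    # Maps tech_name -> merged detail. Dicts keep insertion order (Python 3.7+),
    # so the first occurrence of each tech_name decides its output position.
    keyed_details = {}
    for detail in details:
        # The first time a tech_name is seen, setdefault() stores a copy of the
        # detail with an empty sub_details list; it always returns the stored entry.
        merged = keyed_details.setdefault(
            detail["tech_name"], {**detail, "sub_details": []}
        )
        merged["sub_details"].extend(detail["sub_details"])
    return list(keyed_details.values())


# Shortened, hypothetical sample in the same shape as the question's data.
details = [
    {"name": "name_detB", "tech_name": "techNameB", "others": 22, "sub_details": [{"id": "idB"}]},
    {"name": "name_detA", "tech_name": "techNameA", "others": None, "sub_details": []},
    {"name": "name_detB", "tech_name": "techNameB", "others": 22, "sub_details": [{"id": "idA"}]},
]
merged = merge_details_setdefault(details)
print([(d["tech_name"], [s["id"] for s in d["sub_details"]]) for d in merged])
# prints [('techNameB', ['idB', 'idA']), ('techNameA', [])]
```

The default passed to setdefault() is built on every iteration even when it is not used, which is harmless here but worth knowing for larger inputs.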

The import json is only there so that the print produces nicely formatted output; it is not required as part of the solution.

import json

my_input = [
  {
    "name": "nameA",
    "departments": [
      {
        "name": "dep1",
        "details": [
          {
            "name": "name_detB",
            "tech_name": "techNameB",
            "others": 22,
            "sub_details": [
              {
                "id": "idB",
                "column2": "ZZ",
                "column3": "CCC",
                "column4": {
                  "id": "id2",
                  "subColumn1": "HHH",
                  "subColumn2": "PPPP",
                  "subColumn3": "FFFFFF"
                }
              }
            ]
          },
          {
            "name": "name_detA",
            "tech_name": "techNameA",
            "others": None,
            "sub_details": []
          },
          {
            "name": "name_detB",
            "tech_name": "techNameB",
            "others": 22,
            "sub_details": [
              {
                "id": "idA",
                "column2": "AA",
                "column3": "BBB",
                "column4": {
                  "id": "id1",
                  "subColumn1": "XXXX",
                  "subColumn2": "YYYYY",
                  "subColumn3": "DDDDDD"
                }
              }
            ]
          }
        ]
      }
    ]
  }
]

def merge_details(details):
    ## --------------------
    ## dict to hold details by key (tech_name)
    keyed_details = {}
    ## --------------------

    ## --------------------
    ## for each "detail": if its key is already in keyed_details, merge the
    ## sub_details lists, otherwise add the detail as the value of the key
    ## --------------------
    for detail in details:
        key = detail["tech_name"]
        if key in keyed_details:
            keyed_details[key]["sub_details"].extend(detail["sub_details"])
        else:
            keyed_details[key] = detail
    ## --------------------

    return list(keyed_details.values())

for elem in my_input:
    for department in elem["departments"]:
        department["details"] = merge_details(department["details"])

print(json.dumps(my_input, indent=4, sort_keys=True))
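
If you would rather stay with itertools.groupby as in the question, a sketch along these lines may work. The key point is that groupby only groups consecutive items, so the details are sorted by tech_name first; this also means the output is ordered by tech_name rather than by first appearance. The function name merge_details_groupby and the shortened sample data are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def merge_details_groupby(details):
    by_tech_name = itemgetter("tech_name")
    merged = []
    # sorted() is stable, so details with the same tech_name keep their
    # original relative order inside each group.
    for _, group in groupby(sorted(details, key=by_tech_name), key=by_tech_name):
        group = list(group)
        # Start from a copy of the first detail so the input is not mutated.
        combined = {**group[0], "sub_details": []}
        for detail in group:
            combined["sub_details"].extend(detail["sub_details"])
        merged.append(combined)
    return merged


# Shortened, hypothetical sample in the same shape as the question's data.
sample = [
    {"name": "name_detB", "tech_name": "techNameB", "others": 22, "sub_details": [{"id": "idB"}]},
    {"name": "name_detA", "tech_name": "techNameA", "others": None, "sub_details": []},
    {"name": "name_detB", "tech_name": "techNameB", "others": 22, "sub_details": [{"id": "idA"}]},
]
result = merge_details_groupby(sample)
print([(d["tech_name"], [s["id"] for s in d["sub_details"]]) for d in result])
# prints [('techNameA', []), ('techNameB', ['idB', 'idA'])]
```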

Upvotes: 1
