el347

Reputation: 87

faster and more 'pythonic' list of dictionaries

For simplicity, I've shown 2 lists in a list, but I'm actually dealing with around a hundred lists in a list, each containing a sizable number of dictionaries. I only want to read the value of the 'status' key from the first dictionary in each list, without checking the other dictionaries in that list (I know they all hold the same value for that key). Then I will perform some sort of clustering within each big dictionary. I also need to concatenate all 'title' values efficiently. Is there a way to make my code more elegant and much faster?

I have:

nested = [
    [
        {'id': 287, 'title': 'hungry badger',  'status': 'High'},
        {'id': 437, 'title': 'roadtrip to Kansas','status': 'High'}
    ],
    [
        {'id': 456, 'title': 'happy title here','status': 'Medium'},
        {'id': 342,'title': 'soft big bear','status': 'Medium'}
    ]
]

I'd like:

result = [
    {
        'High': [
            {'id': 287, 'title': 'hungry badger'},
            {'id': 437, 'title': 'roadtrip to Kansas'}
        ]
    },
    {
        'Medium': [
            {'id': 456, 'title': 'happy title here'},
            {'id': 342, 'title': 'soft big bear'}
        ]
    }
]

What I tried:

for oneList in nested:
    result = {}
    for i in oneList:
        a = list(i.keys())
        m = [i[key] for key in a if key not in ['id', 'title']]
        result[m[0]] = oneList
        for key in a:
            if key not in ['id', 'title']:
                del i[key]

Upvotes: 3

Views: 48

Answers (2)

TigerhawkT3

Reputation: 49330

You could make a defaultdict for each nested list:

import collections

nested = [
    [{'id': 287, 'title': 'hungry badger', 'status': 'High'},
     {'id': 437, 'title': 'roadtrip to Kansas', 'status': 'High'}],
    [{'id': 456, 'title': 'happy title here', 'status': 'Medium'},
     {'id': 342, 'title': 'soft big bear', 'status': 'Medium'}],
]

result = []
for l in nested:
    # One status -> list-of-dicts mapping per inner list
    r = collections.defaultdict(list)
    for d in l:
        # pop() removes 'status' from d in place, so the dicts in nested are modified
        name = d.pop('status')
        r[name].append(d)
    # Store a plain dict so it prints like an ordinary dict
    result.append(dict(r))

This gives the following result:

>>> import pprint
>>> pprint.pprint(result)
[{'High': [{'id': 287, 'title': 'hungry badger'},
           {'id': 437, 'title': 'roadtrip to Kansas'}]},
 {'Medium': [{'id': 456, 'title': 'happy title here'},
             {'id': 342, 'title': 'soft big bear'}]}]
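
Since the question says every dictionary in an inner list shares the same 'status', a variant that reads that key only from the first dictionary, and leaves the input unmodified, might look like the sketch below. The helper name group_by_first_status is made up for illustration, and the code assumes that skipping empty inner lists is acceptable.

```python
def group_by_first_status(nested, key='status'):
    """Hypothetical helper: build one {status: [dicts]} mapping per inner list."""
    result = []
    for group in nested:
        if not group:
            # Assumption: an empty inner list has no status, so it is skipped
            continue
        # Read the shared value once, from the first dictionary only
        status = group[0][key]
        # Copy each dictionary without the status key; the originals stay intact
        stripped = [{k: v for k, v in d.items() if k != key} for d in group]
        result.append({status: stripped})
    return result


nested = [
    [{'id': 287, 'title': 'hungry badger', 'status': 'High'},
     {'id': 437, 'title': 'roadtrip to Kansas', 'status': 'High'}],
    [{'id': 456, 'title': 'happy title here', 'status': 'Medium'},
     {'id': 342, 'title': 'soft big bear', 'status': 'Medium'}],
]
result = group_by_first_status(nested)
```

This trades the in-place pop() for a copy of each dictionary, which costs some memory but keeps nested reusable afterwards.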

Upvotes: 2

gnicholas

Reputation: 2087

from itertools import groupby    
result = groupby(sum(nested,[]), lambda x: x['status'])

How it works:

sum(nested,[]) concatenates all your inner lists together into one big list of dictionaries

groupby(..., lambda x: x['status']) groups the dictionaries in that list by their 'status' value

Note that itertools.groupby returns a lazy iterator of (key, group) pairs, not a list, and it only groups adjacent items, so dictionaries with the same status must already sit next to each other (as they do here). To materialize the groups, do something like the following:

from itertools import groupby    
result = groupby(sum(nested,[]), lambda x: x['status'])
result = {key:list(val) for key,val in result}
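
If dictionaries with the same status are not guaranteed to be adjacent, or there are many inner lists, a more defensive variant could sort before grouping and flatten with itertools.chain.from_iterable, since sum(nested, []) builds a new list at every step. This is only a sketch: like the code above it produces a single dict keyed by status (not the list-of-dicts shape from the question) and leaves the 'status' key inside each dictionary.

```python
from itertools import chain, groupby
from operator import itemgetter

nested = [
    [{'id': 287, 'title': 'hungry badger', 'status': 'High'},
     {'id': 437, 'title': 'roadtrip to Kansas', 'status': 'High'}],
    [{'id': 456, 'title': 'happy title here', 'status': 'Medium'},
     {'id': 342, 'title': 'soft big bear', 'status': 'Medium'}],
]

get_status = itemgetter('status')

# Flatten the inner lists lazily instead of repeatedly concatenating lists
flat = chain.from_iterable(nested)

# groupby only merges adjacent items, so sort by the grouping key first;
# sorted() is stable, so items keep their original order within a status
ordered = sorted(flat, key=get_status)

result = {status: list(items) for status, items in groupby(ordered, key=get_status)}
```

Sorting adds an O(n log n) step, so it can be dropped when the input is already known to be ordered by status.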

Upvotes: 2
