ramd

Reputation: 347

Create a dictionary from a list of dictionaries with repeated keys, keeping the max value for each key

I know there are many posts related to dictionary operations, but I could not find a solution for my special case. I have a list of dictionaries (with repeated keys and similar or different values) and I have to create a new dictionary from this list. E.g.:

a = [{u'a': 1}, {u'a': 2}, {u'a': 1}, {u'b': 2}, {u'b': 1}, {u'c': 1}, {u'c': 1}]

Output I am looking for:

{'a': 2, 'b':2, 'c': 1}

So as you can see, I just want one entry per key from the list, and the value for that key should be the max of all its values. I hope it's not too confusing. I have come up with a working solution, but I wanted to check if there is a more Pythonic answer (fewer lines or a better approach). This is my working solution:

d = {}
for i in a:
    if not d.get(i.keys()[0]):
        d.update(i)
    elif d.get(i.keys()[0], 0) < i.values()[0]:
        d.update(i)
print d

Thanks for your time.

Upvotes: 2

Views: 96

Answers (4)

dawg

Reputation: 104062

You can sort the list a so that like keys are grouped together and the largest values come last. Then build the dict so that the last value seen for each key is the one that remains:

>>> a = [{u'a': 1}, {u'a': 2}, {u'a': 1}, {u'b': 2}, {u'b': 1}, {u'c': 1}, {u'c': 1}]
>>> {k:v for k,v in (x.items()[0] for x in sorted(a))}
{u'a': 2, u'c': 1, u'b': 2}

Or, alternate syntax:

>>> dict(x.items()[0] for x in sorted(a))

Syntax that works on both Python 2 and 3:

>>> {k:v for k,v in (sorted(list(x.items())[0] for x in a))}
{'a': 2, 'b': 2, 'c': 1}
>>> dict(sorted(list(x.items())[0] for x in a))
{'a': 2, 'b': 2, 'c': 1}

From comments: what's happening here?

First, let's come up with a more instructive example:

>>> a = [{u'a': -1}, {u'a': -11}, {u'a': -3}, {u'b': 0}, {u'b': 100}, {u'c': 3}, {u'c': 1}]

The desired intermediate result is a list of pairs ordered i) by key, so that equal keys are grouped together, and then ii) by value, compared numerically in increasing order. (The key order survives in the final dict only where dicts preserve insertion order, as in Python 3.7+, or with an OrderedDict.)

So try this first:

>>> sorted(list(x.items())[0] for x in a)
[('a', -11), ('a', -3), ('a', -1), ('b', 0), ('b', 100), ('c', 1), ('c', 3)]

Break it apart:

sorted(list(x.items())[0] for x in a)
^^^^^^                                 sort the tuples first by key, then by value
       ^^^^^^^^^^^^^^^^^^              convert each dict to a two-element (key, value) tuple
                          ^^^^^^^^^^   generator expression over
                                   ^   a, a list of one-element dicts

So that works by sorting the tuples first by the keys, then by the values.
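To illustrate the tuple ordering described above, here is a minimal sketch (the sample pairs are made up for illustration). Python compares tuples element by element, so the second item only matters when the first items are equal:

```python
# Tuples compare element by element: first item, then second item on ties.
pairs = [('b', 100), ('a', -1), ('a', -11), ('b', 0)]

ordered = sorted(pairs)
print(ordered)
# [('a', -11), ('a', -1), ('b', 0), ('b', 100)]
```

Within each key the largest value ends up last, which is what the dict construction relies on.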

Which leads to an alternate solution using groupby:

>>> from itertools import groupby
>>> for k,v in groupby(sorted(list(x.items())[0] for x in a), key=lambda t: t[0]):
...     print(k, max(v))
... 
a ('a', -1)
b ('b', 100)
c ('c', 3)
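The loop above prints whole tuples. If a dict is wanted instead, one possible sketch is a dict comprehension over the groups, taking the max of the values only (the names pairs and result, and the use of operator.itemgetter in place of the lambda, are choices made for this illustration):

```python
from itertools import groupby
from operator import itemgetter

a = [{'a': -1}, {'a': -11}, {'a': -3}, {'b': 0}, {'b': 100}, {'c': 3}, {'c': 1}]

# Flatten each one-item dict into a (key, value) tuple and sort,
# so that tuples with equal keys are adjacent (groupby requires this).
pairs = sorted(list(x.items())[0] for x in a)

# For each group of tuples sharing a key, keep only the largest value.
result = {k: max(v for _, v in group)
          for k, group in groupby(pairs, key=itemgetter(0))}
print(result)
# {'a': -1, 'b': 100, 'c': 3}
```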

The groupby step itself is lazy and builds no extra container, although sorted() still materializes the full list of tuples in both solutions. The first solution will likely be faster for a small list of dicts, but you would need to test that.

It is not required in the solution that I gave that the keys be grouped (it is required for groupby to work). This works too:

>>> sorted((list(x.items())[0] for x in a), key=lambda t: t[1])
[('a', -11), ('a', -3), ('a', -1), ('b', 0), ('c', 1), ('c', 3), ('b', 100)]

Then turn it into a dict with the dict constructor. Recall that it accepts an iterable of (key, value) tuples, and a later tuple overwrites an earlier one with the same key:

>>> dict(sorted((list(x.items())[0] for x in a), key=lambda t: t[1]))
{'a': -1, 'b': 100, 'c': 3}

Upvotes: 2

Filip Młynarski

Reputation: 3612

You could do it by iterating over all of your dicts and updating the final dict new_a whenever the given key isn't in new_a yet or its stored value is lower than the current value.

a = [{u'a': 1}, {u'a': 2}, {u'a': 1}, {u'b': 2}, {u'b': 1}, {u'c': 1}, {u'c': 1}]
new_a = {}

for dict_ in a:
    key, value = list(dict_.items())[0]
    if key not in new_a or new_a[key] < value:
        new_a[key] = value

print(new_a) # -> {'c': 1, 'b': 2, 'a': 2}

Upvotes: 1

Tim

Reputation: 2843

You could use a defaultdict:

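from collections import defaultdict

# The default of 0 assumes every value is positive: a key whose values are all negative ends up as 0.
d = defaultdict(lambda: 0)
for val in a:
    key, value = list(val.items())[0]  # the single (key, value) pair; works on Python 2 and 3
    if d[key] < value:
        d[key] = value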

Output

{u'a': 2, u'b': 2, u'c': 1}
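If the values can be zero or negative, a default of 0 would mask them. A minimal sketch of one way around that, assuming a default of negative infinity (the sample data and the name best are made up for illustration):

```python
from collections import defaultdict

a = [{'a': -1}, {'a': -11}, {'b': 0}, {'b': 100}, {'c': -3}]

# Assumption: values are numbers, so -inf compares below all of them.
best = defaultdict(lambda: float('-inf'))
for item in a:
    for key, value in item.items():
        if best[key] < value:
            best[key] = value

result = dict(best)  # convert back to a plain dict for display
print(result)
# {'a': -1, 'b': 100, 'c': -3}
```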

Upvotes: 1

Dani Mesejo

Reputation: 61920

You could do:

a = [{u'a': 1}, {u'a': 2}, {u'a': 1}, {u'b': 2}, {u'b': 1}, {u'c': 1}, {u'c': 1}]

result = {}
for di in a:
    for key, value in di.items():
        result[key] = max(value, result.get(key, value))
print(result)

Output

{'a': 2, 'c': 1, 'b': 2}
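Because the inner loop visits every item, this approach should also cope with dicts that hold more than one key, or none at all. A sketch wrapping it in a function (max_per_key and the sample data are hypothetical, chosen for illustration):

```python
def max_per_key(dicts):
    """Return one entry per key, holding the largest value seen for it."""
    result = {}
    for d in dicts:
        for key, value in d.items():
            # On first sight of a key, result.get falls back to value itself.
            result[key] = max(value, result.get(key, value))
    return result


mixed = [{'a': 1, 'b': 5}, {'a': 4}, {}, {'b': -2, 'c': 0}]
print(max_per_key(mixed))
# {'a': 4, 'b': 5, 'c': 0}
```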

Upvotes: 1
