Reputation: 347
I know there are many posts related to dictionary operations, but I could not find a solution for my particular case. I have a list of dictionaries (repeated keys with the same or different values) and I have to create a new dictionary from this list. E.g.:
a = [{u'a': 1}, {u'a': 2}, {u'a': 1}, {u'b': 2}, {u'b': 1}, {u'c': 1}, {u'c': 1}]
Output I am looking for:
{'a': 2, 'b':2, 'c': 1}
So as you can see, I want just one entry per key from the list, and the value for that key should be the max of all its values. Hope it's not too confusing. I have come up with a working solution, but I wanted to check whether there is a more Pythonic answer (fewer lines or a better approach). This is my working solution:
d = {}
for i in a:
    if not d.get(i.keys()[0]):
        d.update(i)
    elif d.get(i.keys()[0], 0) < i.values()[0]:
        d.update(i)
print d
Thanks for your time.
Upvotes: 2
Views: 96
Reputation: 104062
You can sort the list a so that like keys are grouped and the largest values come last. Then build a dict from the sorted items, so that the last value seen for each key is the one left in the dict:
>>> a = [{u'a': 1}, {u'a': 2}, {u'a': 1}, {u'b': 2}, {u'b': 1}, {u'c': 1}, {u'c': 1}]
>>> {k:v for k,v in (x.items()[0] for x in sorted(a))}
{u'a': 2, u'c': 1, u'b': 2}
Or, alternate syntax:
>>> dict(x.items()[0] for x in sorted(a))
For Python 2 and 3 syntax:
>>> {k:v for k,v in (sorted(list(x.items())[0] for x in a))}
{'a': 2, 'b': 2, 'c': 1}
>>> dict(sorted(list(x.items())[0] for x in a))
{'a': 2, 'b': 2, 'c': 1}
From comments: what's happening here?
First, let's come up with a more instructive example:
>>> a = [{u'a': -1}, {u'a': -11}, {u'a': -3}, {u'b': 0}, {u'b': 100}, {u'c': 3}, {u'c': 1}]
The desired ordering here is i) like keys grouped together in sorted order and then ii) within each group, values compared as numbers in increasing order. (The key order is only visible in a dict that maintains insertion order, as in recent Python 3 versions, or in an OrderedDict.)
So try this first:
>>> sorted(list(x.items())[0] for x in a)
[('a', -11), ('a', -3), ('a', -1), ('b', 0), ('b', 100), ('c', 1), ('c', 3)]
Break it apart:
sorted(list(x.items())[0] for x in a)
                          ^     ^       comprehension of
                                   ^    a list of one-element dicts
       ^             ^ ^                convert each dict to a two-element tuple
^                                       sort the tuples first by key, then by value
So that works by sorting the tuples first by the keys, then by the values.
Which leads to an alternate solution using groupby:
>>> from itertools import groupby
>>> for k,v in groupby(sorted(list(x.items())[0] for x in a), key=lambda t: t[0]):
...     print(k, max(v))
...
a ('a', -1)
b ('b', 100)
c ('c', 3)
The groupby solution may be more memory friendly, since it consumes the sorted items one group at a time, although sorted() itself still builds a full list in both approaches. The first solution will likely be faster with a smaller list of dicts (but you would need to test that).
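If you want a dict rather than printed tuples from the groupby approach, a minimal sketch could look like the following. The names pairs and result are only illustrative, and the input is the example list from above:

```python
from itertools import groupby
from operator import itemgetter

a = [{'a': -1}, {'a': -11}, {'a': -3}, {'b': 0}, {'b': 100}, {'c': 3}, {'c': 1}]

# Flatten each one-element dict into a (key, value) tuple and sort,
# so that tuples with equal keys end up next to each other.
pairs = sorted(list(x.items())[0] for x in a)

# For every run of equal keys, keep only the largest value.
result = {
    key: max(value for _, value in group)
    for key, group in groupby(pairs, key=itemgetter(0))
}
print(result)
```

Taking max over the values only (instead of the whole tuples) avoids having to unpack the winning tuple afterwards.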
It is not required in the solution that I gave that the keys be grouped (that is required for groupby to work). This works too:
>>> sorted((list(x.items())[0] for x in a), key=lambda t: t[1])
[('a', -11), ('a', -3), ('a', -1), ('b', 0), ('c', 1), ('c', 3), ('b', 100)]
Then turn it into a dict with the dict constructor. Recall that it accepts a list of (key, value) tuples:
>>> dict(sorted((list(x.items())[0] for x in a), key=lambda t: t[1]))
{'a': -1, 'b': 100, 'c': 3}
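The whole approach relies on dict() keeping the last value it sees for a repeated key. A small sketch to convince yourself of that, using the sorted tuples from above (the variable names are just for illustration):

```python
# Tuples sorted by value only; keys are not grouped.
pairs = [('a', -11), ('a', -3), ('a', -1), ('b', 0), ('c', 1), ('c', 3), ('b', 100)]

# Later tuples overwrite earlier ones with the same key,
# so the largest value per key survives.
largest = dict(pairs)

# Feeding the tuples in the opposite order keeps the smallest value instead.
smallest = dict(reversed(pairs))

print(largest)
print(smallest)
```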
Upvotes: 2
Reputation: 3612
You could do this by iterating over all of your dicts and updating the final dict new_a with each item if its key isn't in new_a yet or the stored value is lower than the new value.
a = [{u'a': 1}, {u'a': 2}, {u'a': 1}, {u'b': 2}, {u'b': 1}, {u'c': 1}, {u'c': 1}]
new_a = {}
for dict_ in a:
    key, value = list(dict_.items())[0]
    if key not in new_a or new_a[key] < value:
        new_a[key] = value
print(new_a) # -> {'c': 1, 'b': 2, 'a': 2}
Upvotes: 1
Reputation: 2843
You could use a defaultdict:
from collections import defaultdict
d = defaultdict(int)  # missing keys start at 0, so this assumes non-negative values
for val in a:
    key, value = list(val.items())[0]  # works on both Python 2 and 3
    if d[key] < value:
        d[key] = value
print(dict(d))
Output
{u'a': 2, u'b': 2, u'c': 1}
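Note that a default of 0 would hide keys whose values are all negative. A possible variant, assuming the values are plain numbers, starts each key at negative infinity instead. This is only a sketch, and the sample list with negative values is made up for the demonstration:

```python
from collections import defaultdict

a = [{'a': -1}, {'a': -11}, {'a': -3}, {'b': 0}, {'b': 100}, {'c': 3}, {'c': 1}]

# Assumption: every value is a real number, so anything beats -inf.
d = defaultdict(lambda: float('-inf'))
for item in a:
    for key, value in item.items():
        d[key] = max(d[key], value)

result = dict(d)
print(result)
```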
Upvotes: 1
Reputation: 61920
You could do:
a = [{u'a': 1}, {u'a': 2}, {u'a': 1}, {u'b': 2}, {u'b': 1}, {u'c': 1}, {u'c': 1}]
result = {}
for di in a:
    for key, value in di.items():
        result[key] = max(value, result.get(key, value))
print(result)
Output
{'a': 2, 'c': 1, 'b': 2}
Upvotes: 1