LA_

Reputation: 20409

How to count number of repeated keys in several dictionaries?

Let's say I have a huge number of dictionaries (it could be 10,000 dictionaries). I would like to count how many times each key occurs across all of them. For example, take the 3 dictionaries stored in lists in the code below.

For those, the result should be {1: [2, 'url1'], 10: [1, 'url5'], 3: [1, 'url2'], 5: [2, 'url4'], 7: [2, 'url3']}.

I came up with the following code:

lists = [{1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'}, {1: 'url1', 7: 'url3'}, {5: 'url4', 10: 'url5'}]
result = {}
for l in lists:
    for i in l:
        if i in result:
            result[i][0] += 1
        else:
            result[i] = [1, l[i]]

Is there any better (faster) way to do it?

Upvotes: 0

Views: 462

Answers (1)

Robᵩ

Reputation: 168626

If you can accept slightly different output, this might work for you:

from collections import Counter

dicts = [
    {1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'},
    {1: 'url1', 7: 'url3'},
    {5: 'url4', 10: 'url5'},
]

result = Counter()
for d in dicts:
    result.update(d.keys())

print(dict(result))

Note that the result has keys and counts, but no values.

Alternatively:

from collections import Counter
from itertools import chain

dicts = [
    {1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'},
    {1: 'url1', 7: 'url3'},
    {5: 'url4', 10: 'url5'},
]

result = Counter(chain.from_iterable(dicts))

print(dict(result))

Final version: this one produces exactly your requested output:

from collections import Counter
from itertools import chain

dicts = [
    {1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'},
    {1: 'url1', 7: 'url3'},
    {5: 'url4', 10: 'url5'},
]

# Count identical (key, value) pairs, then reshape into {key: [count, value]}.
result = Counter(chain.from_iterable(d.items() for d in dicts))
result = {k: [n, v] for (k, v), n in result.items()}

print(result)
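
One caveat with the final version: it counts (key, value) pairs, so it relies on each key always mapping to the same value, as in your sample data. If a key could appear with different values in different dictionaries, a variant along these lines might be safer. This is only a sketch: the helper name count_keys_with_first_value is made up for illustration, and keeping the first value seen is an assumption about what you would want in that case.

```python
from collections import Counter
from itertools import chain


def count_keys_with_first_value(dicts):
    """Return {key: [count, first value seen]} (hypothetical helper)."""
    # Iterating a dict yields its keys, so this counts keys only.
    counts = Counter(chain.from_iterable(dicts))

    # Assumption: when a key has several values, keep the first one seen.
    first_values = {}
    for d in dicts:
        for key, value in d.items():
            first_values.setdefault(key, value)

    return {key: [n, first_values[key]] for key, n in counts.items()}


dicts = [
    {1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'},
    {1: 'url1', 7: 'url3'},
    {5: 'url4', 10: 'url5'},
]

result = count_keys_with_first_value(dicts)
print(result)
```

For the sample data this gives the same result as the version above. The difference only shows up when the values for a key disagree.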

Upvotes: 2
