Reputation: 20409
Let's say I have a huge number of dictionaries (it could be 10,000). I would like to count how many of the dictionaries contain each key. For example, if I have 3 dictionaries:
{1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'}
{1: 'url1', 7: 'url3'}
{5: 'url4', 10: 'url5'}
Then as a result I should get {1: [2, 'url1'], 10: [1, 'url5'], 3: [1, 'url2'], 5: [2, 'url4'], 7: [2, 'url3']}.
I came up with the following code:
lists = [{1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'}, {1: 'url1', 7: 'url3'}, {5: 'url4', 10: 'url5'}]
result = {}
for l in lists:
    for i in l:
        if i in result:
            result[i][0] += 1
        else:
            result[i] = [1, l[i]]
Is there any better (faster) way to do it?
Upvotes: 0
Views: 462
Reputation: 168626
If you can accept slightly different output, this might work for you:
from collections import Counter
dicts = [
    {1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'},
    {1: 'url1', 7: 'url3'},
    {5: 'url4', 10: 'url5'},
]
result = Counter()
for d in dicts:
    # Count each key once per dictionary it appears in
    result.update(d.keys())
print(dict(result))
Note that this result has keys and counts, but no values.
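If the values are needed as well, one possible sketch is to keep a separate lookup table next to the counter. The name `values` is made up for this illustration, and the sketch assumes that a given key maps to the same value in every dictionary (otherwise the last one seen wins):

```python
from collections import Counter

dicts = [
    {1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'},
    {1: 'url1', 7: 'url3'},
    {5: 'url4', 10: 'url5'},
]

counts = Counter()
values = {}  # hypothetical helper: key -> value lookup
for d in dicts:
    counts.update(d.keys())  # one count per dictionary containing the key
    values.update(d)         # assumption: later values overwrite earlier ones

# Combine the counts and the values into the [count, value] shape
result = {k: [n, values[k]] for k, n in counts.items()}
print(result)
```

This keeps the Counter-based counting unchanged and only adds the lookup on top.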
Alternatively:
from collections import Counter
from itertools import chain
dicts = [
    {1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'},
    {1: 'url1', 7: 'url3'},
    {5: 'url4', 10: 'url5'},
]
# Iterating over a dict yields its keys, so this counts every key across all dicts
result = Counter(chain.from_iterable(dicts))
print(dict(result))
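This version relies on the fact that iterating over a dictionary yields its keys, so chain.from_iterable(dicts) walks through every key of every dictionary in turn, duplicates included. A small sketch to illustrate, reusing the same sample data (the name `keys` is only for this example):

```python
from collections import Counter
from itertools import chain

dicts = [
    {1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'},
    {1: 'url1', 7: 'url3'},
    {5: 'url4', 10: 'url5'},
]

# Flatten the dictionaries into one stream of keys (values are ignored)
keys = list(chain.from_iterable(dicts))
print(keys)

# Counter then simply tallies how often each key occurs in that stream
print(Counter(keys))
```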
Final version: this one produces exactly your requested output:
from collections import Counter
from itertools import chain
dicts = [
    {1: 'url1', 3: 'url2', 7: 'url3', 5: 'url4'},
    {1: 'url1', 7: 'url3'},
    {5: 'url4', 10: 'url5'},
]
# Count (key, value) pairs, then reshape each entry into key -> [count, value]
result = Counter(chain.from_iterable(d.items() for d in dicts))
result = {k: [n, v] for (k, v), n in result.items()}
print(result)
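As for whether this is actually faster than the plain loop from the question, it is probably best to measure on your own data. Below is a rough sketch using timeit. The function names and the randomly generated sample data are made up for illustration, and it assumes each key always maps to the same hashable value:

```python
import random
import timeit
from collections import Counter
from itertools import chain


def count_with_loop(dicts):
    # The plain nested loop from the question
    result = {}
    for d in dicts:
        for k in d:
            if k in result:
                result[k][0] += 1
            else:
                result[k] = [1, d[k]]
    return result


def count_with_counter(dicts):
    # The Counter-based version; values must be hashable
    counts = Counter(chain.from_iterable(d.items() for d in dicts))
    return {k: [n, v] for (k, v), n in counts.items()}


# Hypothetical test data: 10,000 dictionaries with 1 to 20 keys each
random.seed(0)
sample = [
    {k: 'url%d' % k for k in random.sample(range(1000), random.randint(1, 20))}
    for _ in range(10000)
]

# Both approaches should agree before their timings are compared
assert count_with_loop(sample) == count_with_counter(sample)

for func in (count_with_loop, count_with_counter):
    seconds = timeit.timeit(lambda: func(sample), number=5)
    print(func.__name__, seconds)
```

The timings will depend on the Python version and on how many keys the dictionaries share, so treat the numbers as a guide only.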
Upvotes: 2