RowanX

Reputation: 1317

Count the number of occurrences of a certain value in a dictionary in Python?

If I have got something like this:

D = {'a': 97, 'c': 0 , 'b':0,'e': 94, 'r': 97 , 'g':0}

If I want, for example, to count the number of occurrences of 0 as a value without having to iterate over the whole dictionary, is that even possible, and if so, how?

Upvotes: 40

Views: 110617

Answers (5)

boot-scootin

Reputation: 12515

Alternatively, using collections.Counter:

from collections import Counter
D = {'a': 97, 'c': 0 , 'b':0,'e': 94, 'r': 97 , 'g':0}

Counter(D.values())[0]
# 3
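If more than one value needs counting, the same Counter can be built once and queried repeatedly. A minimal sketch, assuming the dictionary from the question; the names value_counts, zeros, nineties and missing are only illustrative:

```python
from collections import Counter

D = {'a': 97, 'c': 0, 'b': 0, 'e': 94, 'r': 97, 'g': 0}

# One pass over the values builds a tally of every distinct value.
value_counts = Counter(D.values())

zeros = value_counts[0]      # occurrences of 0
nineties = value_counts[97]  # occurrences of 97
missing = value_counts[5]    # a Counter returns 0 for a value it never saw

print(zeros, nineties, missing)
```

Building the Counter still iterates over all the values once; the benefit is that later lookups do not repeat that pass.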

Upvotes: 29

Sanket Suryawanshi

Reputation: 99

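from collections import Counter

# Build the tally once, then print, for every key, how often its value occurs.
# hashmap is the dictionary being inspected.
counts = Counter(hashmap.values())
for key in hashmap:
    print(counts[hashmap[key]])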

Upvotes: 1

Kasravnd

Reputation: 107347

As mentioned in another answer, operator.countOf() is the way to go, but you can also use a generator expression inside sum(), as follows:

sum(value == 0 for value in D.values())
# Or the following, which is faster in the benchmark below
sum(1 for v in D.values() if v == 0)

Or, as a more functional approach, you can use map() and pass the bound __eq__ method of the integer as the function to apply. In the benchmark below this is faster than the first generator expression, though not faster than the second.

sum(map((0).__eq__, D.values()))
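One caveat with the bound-method form: (0).__eq__ returns NotImplemented when the other operand is not an int, so sum() raises TypeError as soon as the values include a float or a string. A minimal sketch of that behaviour; the dictionary mixed and the variable names are hypothetical, and partial(eq, 0) is one possible substitute rather than something the answer proposes:

```python
from functools import partial
from operator import eq

D = {'a': 97, 'c': 0, 'b': 0, 'e': 94, 'r': 97, 'g': 0}

# With int values only, the bound method works as intended.
int_only_count = sum(map((0).__eq__, D.values()))

# Hypothetical dictionary whose values are not all ints.
mixed = {'a': 0, 'b': 0.0, 'c': 'zero'}

try:
    sum(map((0).__eq__, mixed.values()))
    bound_method_failed = False
except TypeError:
    # (0).__eq__(0.0) is NotImplemented, which sum() cannot add.
    bound_method_failed = True

# operator.eq goes through the normal == machinery, so it handles mixed types.
mixed_count = sum(map(partial(eq, 0), mixed.values()))

print(int_only_count, bound_method_failed, mixed_count)
```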

Benchmark:

In [15]: D = dict(zip(range(1000), range(1000)))

In [16]: %timeit sum(map((0).__eq__, D.values()))
49.6 µs ± 770 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [17]: %timeit sum(v==0 for v in D.values())
60.9 µs ± 669 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [18]: %timeit sum(1 for v in D.values() if v == 0)
30.2 µs ± 515 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [19]: %timeit countOf(D.values(), 0)
16.8 µs ± 74.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Note that although map() may be faster than the first generator expression in this case, you should also run the benchmark on relatively large datasets to get a more complete picture of the approaches. Then you can choose the approach that best suits the structure and amount of data you have.
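A rough way to repeat the comparison at several sizes is sketched below. The helper time_approaches, the chosen sizes and the repetition count are assumptions for illustration, not part of the benchmark above; the lambdas add a small constant call overhead to every measurement.

```python
from operator import countOf
from timeit import timeit


def time_approaches(size, number=20):
    """Return the mean seconds per call of each approach on a dict of `size` ints."""
    D = dict(zip(range(size), range(size)))
    approaches = {
        'map + __eq__': lambda: sum(map((0).__eq__, D.values())),
        'generator with if': lambda: sum(1 for v in D.values() if v == 0),
        'countOf': lambda: countOf(D.values(), 0),
    }
    return {name: timeit(func, number=number) / number
            for name, func in approaches.items()}


for size in (10, 1_000, 100_000):
    print(f'size={size}')
    for name, seconds in time_approaches(size).items():
        print(f'  {seconds * 1e6:10.2f} µs  {name}')
```

The absolute numbers depend on the machine and the Python version, so only the relative ordering at each size is worth reading.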

Upvotes: 56

Kelly Bundy

Reputation: 27640

That's a job for operator.countOf.

from operator import countOf

countOf(D.values(), 0)
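countOf(a, b) returns the number of occurrences of b in a and accepts any iterable, so the values view can be passed directly without building a list first. A small sketch, assuming the dictionary from the question; the variable names are illustrative:

```python
from operator import countOf

D = {'a': 97, 'c': 0, 'b': 0, 'e': 94, 'r': 97, 'g': 0}

# The dict view is consumed directly; no intermediate list is created.
via_countof = countOf(D.values(), 0)

# Same result as the list-based approach.
via_list = list(D.values()).count(0)

# Matching is by equality, so 0.0 and False count as 0, but the string '0' does not.
loose_matches = countOf([0, 0.0, False, '0'], 0)

print(via_countof, via_list, loose_matches)
```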

Benchmark with your example dictionary:

1537 ns  1540 ns  1542 ns  Counter(D.values())[0]
 791 ns   800 ns   802 ns  sum(value == 0 for value in D.values())
 694 ns   697 ns   717 ns  sum(map((0).__eq__, D.values()))
 680 ns   682 ns   689 ns  sum(1 for value in D.values() if value == 0)
 599 ns   599 ns   600 ns  sum([1 for i in D.values() if i == 0])
 368 ns   369 ns   375 ns  list(D.values()).count(0)
 229 ns   231 ns   231 ns  countOf(D.values(), 0)

Code:

from timeit import repeat

setup = '''
from collections import Counter
from operator import countOf
D = {'a': 97, 'c': 0 , 'b':0,'e': 94, 'r': 97 , 'g':0}
'''

E = [
    'Counter(D.values())[0]',
    'sum(value == 0 for value in D.values())',
    'sum(map((0).__eq__, D.values()))',
    'sum(1 for value in D.values() if value == 0)',
    'sum([1 for i in D.values() if i == 0])',
    'list(D.values()).count(0)',
    'countOf(D.values(), 0)',
]

for _ in range(3):
    for e in E:
        number = 10 ** 5
        ts = sorted(repeat(e, setup, number=number))[:3]
        print(*('%4d ns ' % (t / number * 1e9) for t in ts), e)
    print()

Upvotes: 6

user1767754

Reputation: 25154

You can count them by converting the values to a list, as follows:

D = {'a': 97, 'c': 0 , 'b':0,'e': 94, 'r': 97 , 'g':0}
print(list(D.values()).count(0))
# 3

Or iterating over the values:

print(sum([1 for i in D.values() if i == 0]))
# 3

Upvotes: 13
