Reputation: 6149
I have a list of values that denote the number of times a piece of regex matches in a string. From this, I want to find the numbers that appear more than once, along with their counts. For example, for [2, 2, 2, 0, 2, 1, 3, 3] I want {2:4,3:2} as output if it's a dict, or [[2,4],[3,2]] if it's a list of lists. I'm looking for the fastest, most concise way to do this. Right now I do it with the following code, but I think it's way too verbose to be optimal.
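numWinners = [2, 2, 2, 0, 2, 1, 3, 3]
# count occurrences of each value (calls list.count once per element)
tieCount = {x: numWinners.count(x) for x in numWinners}
# keep only the values that occur more than once
ties = dict()
for key, value in tieCount.items():
    if value > 1:
        ties[key] = value
print(ties)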
{2: 4, 3: 2}
A list or dict output isn't really an issue for me - again, whatever is fastest and concise.
Upvotes: 2
Views: 1512
Reputation: 96
Try using collections.defaultdict:
import collections
# missing keys start at int() == 0
ties = collections.defaultdict(int)
for num in numWinners:
    ties[num] += 1
for key, value in ties.items():
    print(key, value)
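The loop above counts every value, including those that occur only once. A minimal sketch of keeping just the repeated values, assuming the numWinners list from the question (the helper name find_ties is hypothetical, not part of any library):

```python
import collections

def find_ties(values):
    """Return {value: count} for values that occur more than once."""
    counts = collections.defaultdict(int)
    # single pass over the input to build the counts
    for v in values:
        counts[v] += 1
    # drop every value that was seen only once
    return {k: c for k, c in counts.items() if c > 1}

numWinners = [2, 2, 2, 0, 2, 1, 3, 3]
ties = find_ties(numWinners)
print(ties)
```

This walks the list once to count, then filters, rather than calling list.count for each element.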
Upvotes: 0
Reputation: 11694
You can use a dict comprehension to create a histogram:
>>> ns=[2, 2, 2, 0, 2, 1, 3, 3]
>>> {x: ns.count(x) for x in set(ns) if ns.count(x) > 1}
{2: 4, 3: 2}
Upvotes: 2
Reputation: 353499
I'd combine collections.Counter with a dictionary comprehension to select the duplicates:
>>> from collections import Counter
>>> numWinners = [2, 2, 2, 0, 2, 1, 3, 3]
>>> counts = Counter(numWinners)
>>> {k: v for k,v in counts.items() if v > 1}
{2: 4, 3: 2}
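If the list-of-lists form from the question is preferred, the same Counter can probably be filtered into that shape as well. A sketch, where the name tie_pairs is made up for illustration; Counter.most_common() returns (value, count) pairs ordered from the highest count to the lowest:

```python
from collections import Counter

numWinners = [2, 2, 2, 0, 2, 1, 3, 3]
counts = Counter(numWinners)

# most_common() yields (value, count) pairs, highest count first;
# keep only the pairs whose count is greater than one
tie_pairs = [[k, v] for k, v in counts.most_common() if v > 1]
print(tie_pairs)
```

Iterating over most_common() instead of items() is optional; it only affects the order of the result.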
Upvotes: 5