Reputation: 1072
So I have a list of strings:
list1 = ["1thing", "2thing", "3thing", "1thing"]
and I want to find out how many times each one appears in the list. The thing is, I only want to compare the first few characters, because I know that if the first, say, 3 characters are the same, then the whole string is the same. I was thinking that I could modify the built-in list.count(x) method, or override the __eq__ operator, but I'm not sure how to do either of those.
Upvotes: 3
Views: 13776
Reputation: 49866
Use a generator expression to extract the first couple of characters, and pass it to the standard-library collections.Counter class:
from collections import Counter
Counter(item[:2] for item in list1)
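As a minimal, self-contained sketch of the same idea (the helper name count_by_prefix and its parameter n are illustrative assumptions, not part of the answer above):

```python
from collections import Counter

def count_by_prefix(items, n=3):
    """Count items by their first n characters (hypothetical helper)."""
    # The generator expression yields only the prefix of each string,
    # so Counter tallies prefixes rather than whole strings.
    return Counter(item[:n] for item in items)

list1 = ["1thing", "2thing", "3thing", "1thing"]
counts = count_by_prefix(list1, n=3)
print(counts["1th"])          # 2
print(counts.most_common(1))  # [('1th', 2)]
```

A Counter is a dict subclass, so a missing prefix such as counts["9th"] simply returns 0 instead of raising KeyError.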
Upvotes: 9
Reputation: 7965
Probably not as good a solution as @Marcin's, but using itertools.groupby might make it more readable and flexible.
from itertools import groupby

def group_by_startswith(it, n):
    """Get a dict mapping the first n characters to the number of matches."""
    def first_n(str_):
        return str_[:n]

    # groupby only merges adjacent items, so sort by the same key first
    startswith_sorted = sorted(it, key=first_n)
    groups = groupby(startswith_sorted, key=first_n)
    return {key: len(list(grouped)) for key, grouped in groups}
Example Output:
>>> list1 = ["1thing", "2thing", "3thing", "1thing"]
>>> print(group_by_startswith(list1, 3))
{'3th': 1, '2th': 1, '1th': 2}
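This solution gives you a little more flexibility with the result. For example, storing list(grouped) instead of len(list(grouped)) in the returned dict lets you easily get the matching objects. (Storing grouped itself will not work: each group is an iterator that is invalidated as soon as groupby advances to the next group, so it has to be consumed or copied inside the comprehension.)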
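A rough sketch of that variation, which keeps the matching strings rather than just their count (the function name group_items_by_prefix is made up for illustration):

```python
from itertools import groupby

def group_items_by_prefix(items, n):
    """Map each n-character prefix to the list of items that start with it."""
    def first_n(s):
        return s[:n]

    # Sort first: groupby only groups consecutive items with equal keys.
    ordered = sorted(items, key=first_n)
    # list(group) copies each group before groupby moves on to the next one.
    return {key: list(group) for key, group in groupby(ordered, key=first_n)}

list1 = ["1thing", "2thing", "3thing", "1thing"]
matches = group_items_by_prefix(list1, 3)
print(matches["1th"])  # ['1thing', '1thing']
```

The counts are still available afterwards via len(matches[key]).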
Upvotes: 1
Reputation: 7281
Why go through all the hassle? Just use the collections.Counter class to find the frequencies.
>>> import collections
>>> x=['1thing', '2thing', '1thing', '3thing']
>>> y=collections.Counter(x)
>>> y
Counter({'1thing': 2, '2thing': 1, '3thing': 1})
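Counting whole strings gives the same totals as counting prefixes only if the question's premise holds, i.e. equal prefixes always mean equal strings. A quick way to check that on your own data, sketched here with a 3-character prefix as an assumed length:

```python
from collections import Counter

x = ['1thing', '2thing', '1thing', '3thing']

whole = Counter(x)                      # keyed by the full string
by_prefix = Counter(s[:3] for s in x)   # keyed by the first 3 characters

# If every prefix corresponds to exactly one distinct string,
# both counters have the same number of keys and the same counts.
same_totals = sorted(whole.values()) == sorted(by_prefix.values())
print(same_totals)  # True
```

If same_totals comes out False, some strings share a prefix but differ later on, and the two approaches will give different answers.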
Upvotes: 5