Reputation: 7536
Given this list of strings:
list=['foo','foo','foo','bar','bar','baz','baz','baz']
I'd like to get a list of the corresponding numbers, as if this were an index with tied ranks, like this:
numbers=[0,0,0,1,1,2,2,2]
Thanks in advance!
Upvotes: 0
Views: 21
Reputation: 155536
Assuming the strings are already grouped (all repeated strings are consecutive), the lowest-overhead way to do this is with itertools.groupby:
from itertools import groupby
numbers = [i for i, (_, g) in enumerate(groupby(mylist)) for _ in g]
This just groups the entries in mylist (list is a terrible name for a variable, since it shadows the list constructor), and produces i (the 0-up count of groups seen so far) once for each entry in the group (we don't even care what the values are, thus for _ in g to indicate that _ is unimportant).
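As a quick check, here is a minimal sketch that runs the comprehension on the list from the question (renamed to mylist, as above). The second call uses a made-up three-element list to show the caveat: groupby only merges consecutive equal values, so a value that reappears later gets a new number.

```python
from itertools import groupby

mylist = ['foo', 'foo', 'foo', 'bar', 'bar', 'baz', 'baz', 'baz']

# enumerate() numbers each run of equal consecutive values;
# the inner loop repeats that number once per entry in the run.
numbers = [i for i, (_, g) in enumerate(groupby(mylist)) for _ in g]
print(numbers)  # [0, 0, 0, 1, 1, 2, 2, 2]

# Caveat: 'd' appears in two separate runs, so it gets two different numbers.
ungrouped = [i for i, (_, g) in enumerate(groupby(['d', 'f', 'd'])) for _ in g]
print(ungrouped)  # [0, 1, 2]
```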
If repeated values might be non-consecutive, but should have the same group number (that is, ['d', 'f', 'd'] might occur, and should produce [0, 1, 0] rather than [0, 1, 2]), you'd use a different approach (which would also work with the consecutive-only case, but requires persistent and growing state that the groupby approach avoids):
from collections import defaultdict
from itertools import count
# If key seen already, returns value, otherwise, returns next unused integer group number
grouptracker = defaultdict(count().__next__) # .next on Py2
numbers = [grouptracker[x] for x in mylist]
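To illustrate, here is a small sketch using a made-up input with non-consecutive repeats (the sample list is an assumption, not from the question). Note that numbers are handed out in first-seen order, not sorted order, and that the dict keeps one entry per distinct value, which is the growing state mentioned above.

```python
from collections import defaultdict
from itertools import count

# Hypothetical sample input with repeats that are not adjacent
mylist = ['d', 'f', 'd', 'a', 'f']

# A missing key triggers count().__next__, which yields 0, 1, 2, ...
grouptracker = defaultdict(count().__next__)
numbers = [grouptracker[x] for x in mylist]

print(numbers)             # [0, 1, 0, 2, 1]
print(dict(grouptracker))  # {'d': 0, 'f': 1, 'a': 2}
```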
Or to one-line it for fun and inscrutability (don't actually do this):
numbers = list(map(defaultdict(count().__next__).__getitem__, mylist))
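If the defaultdict/count combination feels too opaque, the same idea can be spelled out with a plain dict and a loop. This is only a sketch of an equivalent, not a replacement for the code above; group_numbers is a hypothetical helper name introduced here for illustration.

```python
from collections import defaultdict
from itertools import count


def group_numbers(items):
    """Hypothetical helper: number each distinct value in first-seen order."""
    seen = {}      # maps value -> group number
    result = []
    for item in items:
        if item not in seen:
            # The next unused group number equals the count of values seen so far
            seen[item] = len(seen)
        result.append(seen[item])
    return result


sample = ['d', 'f', 'd']
print(group_numbers(sample))  # [0, 1, 0]

# Same result as the one-liner version
one_liner = list(map(defaultdict(count().__next__).__getitem__, sample))
print(one_liner == group_numbers(sample))  # True
```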
Upvotes: 1