Reputation: 504
I have a list of strings that can only take on 4 different string values, e.g.:
y = ['human', 'human', 'human', 'agent', 'agent', 'player', 'player', 'player', 'opponent', 'opponent', 'opponent', 'human', 'human', 'player', 'player', 'player']
I need to get the start and end indices of each group of consecutive identical values, something like:
human_idx = [(0, 2), (11, 12)]
agent_idx = [(3, 4)]
player_idx = [(5, 7), (13, 15)]
opponent_idx = [(8, 10)]
I found a solution for the case where the input is a NumPy array of 0s and 1s, but I am working with a list of strings.
Upvotes: 0
Views: 100
Reputation: 26039
Making variables like that is not advised. You can create a dictionary instead. This is possible using groupby and defaultdict:
from itertools import groupby
from collections import defaultdict
y = ['human', 'human', 'human', 'agent', 'agent', 'player', 'player', 'player', 'opponent', 'opponent', 'opponent', 'human', 'human', 'player', 'player', 'player']
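i = 0  # start index of the current run
result = defaultdict(list)
for k, g in groupby(y):
    elems = len(list(g))  # length of this run of identical values
    result[k].append((i, i + elems - 1))  # inclusive (start, end) pair
    i += elems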
print(result)
# defaultdict(<class 'list'>,
# {'human': [(0, 2), (11, 12)],
# 'agent': [(3, 4)],
# 'player': [(5, 7), (13, 15)],
# 'opponent': [(8, 10)]})
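If you would rather not keep a manual counter, a variation on the same idea is to group over enumerate(y) so that each element carries its own position. This is only a sketch: the function name run_indices is made up for illustration, and it returns a plain dict rather than a defaultdict.

```python
from collections import defaultdict
from itertools import groupby


def run_indices(values):
    """Map each distinct value to a list of inclusive (start, end) index pairs.

    Hypothetical helper, not part of any library.
    """
    result = defaultdict(list)
    # Pair every element with its position, then group consecutive pairs by value.
    for key, group in groupby(enumerate(values), key=lambda pair: pair[1]):
        positions = [index for index, _ in group]
        result[key].append((positions[0], positions[-1]))
    return dict(result)


y = ['human', 'human', 'human', 'agent', 'agent', 'player', 'player', 'player',
     'opponent', 'opponent', 'opponent', 'human', 'human', 'player', 'player', 'player']

idx = run_indices(y)
print(idx['human'])   # [(0, 2), (11, 12)]
print(idx['player'])  # [(5, 7), (13, 15)]

# The end index is inclusive, so add 1 when slicing.
start, end = idx['opponent'][0]
print(y[start:end + 1])  # ['opponent', 'opponent', 'opponent']
```

Note that groupby only merges adjacent equal items, which is exactly what is wanted here: 'human' appears twice in the result because its two runs are separated by other values.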
Upvotes: 1