CentauriAurelius

Reputation: 504

Find indices of groups with the same string value

I have a list of strings that can only take on 4 different string values, e.g.:

y = ['human', 'human', 'human', 'agent', 'agent', 'player', 'player', 'player', 'opponent', 'opponent', 'opponent', 'human', 'human', 'player', 'player', 'player'] 

I need the start and end indices of each run of consecutive identical values, something like:

human_idx = [(0, 2), (11, 12)]
agent_idx = [(3, 4)]
player_idx = [(5, 7), (13, 15)]
opponent_idx = [(8, 10)]

I found a solution for the case of a NumPy array of 0s and 1s, but I am working with a list of strings.

Upvotes: 0

Views: 100

Answers (1)

Austin

Reputation: 26039

Creating a separate variable for each label is not advisable. A dictionary keyed by label is easier to work with, and you can build one using itertools.groupby and collections.defaultdict:

from itertools import groupby
from collections import defaultdict

y = ['human', 'human', 'human', 'agent', 'agent', 'player', 'player', 'player', 'opponent', 'opponent', 'opponent', 'human', 'human', 'player', 'player', 'player']

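i = 0  # start index of the current run
result = defaultdict(list)
for k, g in groupby(y):  # groupby yields one group per run of equal consecutive items
    elems = len(list(g))  # length of this run
    result[k].append((i, i+elems-1))  # inclusive (start, end) pair
    i += elems  # advance to the start of the next run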

print(result)

# defaultdict(<class 'list'>, 
#             {'human': [(0, 2), (11, 12)],
#              'agent': [(3, 4)],
#              'player': [(5, 7), (13, 15)],
#              'opponent': [(8, 10)]})
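
If you need this in more than one place, the same idea can be wrapped in a function. The following is only a sketch: the name run_indices is made up for illustration, and it assumes you want inclusive (start, end) pairs and a plain dict as the return value.

```python
from itertools import groupby
from collections import defaultdict


def run_indices(labels):
    """Map each label to a list of inclusive (start, end) pairs, one per consecutive run.

    run_indices is a hypothetical helper name, not part of any library.
    """
    runs = defaultdict(list)
    start = 0
    for label, group in groupby(labels):
        length = sum(1 for _ in group)  # count the run without building a list
        runs[label].append((start, start + length - 1))
        start += length
    return dict(runs)  # plain dict, so missing labels raise KeyError


y = ['human', 'human', 'human', 'agent', 'agent', 'player', 'player', 'player',
     'opponent', 'opponent', 'opponent', 'human', 'human', 'player', 'player', 'player']

idx = run_indices(y)
human_idx = idx['human']
print(human_idx)  # [(0, 2), (11, 12)]
```

Looking values up as idx['human'] gives you what the separate human_idx, agent_idx, ... variables would have held, without hard-coding the label names.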

Upvotes: 1
