Reputation: 642
Let's say I have this dataframe :
0
0 1
1 1
2 1
3 2
4 2
5 3
6 3
7 1
8 1
I want to store the start and end indexes of each value (even repeated ones) in the dataframe as well as the value corresponding.
So that I would get a result like this for example :
Value | Start | End
----------------------------
1 | 0 | 2
2 | 3 | 4
3 | 5 | 6
1 | 7 | 8
I tried this (for the value 2 for example here) :
cs[['key']] = pd.DataFrame(cs.pop(0).values.tolist())
g = cs.groupby('key')
idx_start, idx_end = g.get_group(2).index[[0,-1]]
But this returns only first and last result each time.
Upvotes: 2
Views: 1028
Reputation: 78750
Given
>>> df
0
0 1
1 1
2 1
3 2
4 2
5 3
6 3
7 1
8 1
Solution:
starts_bool = df.diff().ne(0)[0]
starts = df.index[starts_bool]
ends = df.index[starts_bool.shift(-1, fill_value=True)]
result = (df.loc[starts]
.reset_index(drop=True)
.assign(Start=starts, End=ends)
.rename({0: 'Value'}, axis='columns')
)
Result:
>>> result
value Start End
0 1 0 2
1 2 3 4
2 3 5 6
3 1 7 8
Upvotes: 4