Reputation: 25
I have the following dataframe:
Vela FlgVela
0 R 0
1 V 1
2 V 1
3 R 1
4 R 1
5 V 0
6 R 1
7 R 1
8 R 1
What is the best way to get the result of the dataframe below?
Vela FlgVela AddCol
0 R 0 1
1 V 1 2
2 V 1 2
3 R 1 3
4 R 1 3
5 V 0 4
6 R 1 5
7 R 1 5
8 R 1 5
I have tried the following logic but the result is not what I expected.
df['AddCol'] = df.groupby(df['Vela'].astype(str).str.strip() != df['Vela'].shift(-1).astype(str).str.strip() ).cumcount()+1
Upvotes: 2
Views: 1030
Reputation: 71707
There is no need to group the dataframe. We can compare adjacent rows to create a boolean mask, then calculate cumulative sum on this mask to create a counter
df['AddCol'] = df['Vela'].ne(df['Vela'].shift()).cumsum()
Vela FlgVela AddCol
0 R 0 1
1 V 1 2
2 V 1 2
3 R 1 3
4 R 1 3
5 V 0 4
6 R 1 5
7 R 1 5
8 R 1 5
Upvotes: 2
Reputation: 18315
I think you're close, here is one way:
df["AddCol"] = df.groupby("Vela").ngroup().diff().ne(0).cumsum()
where we first get the group number each distinct Vela
belongs to (kind of factorize) then take the first differences and see if they are not equal to 0. This will sort of give the "turning" points from one group to another. Then we cumulatively sum them,
to get
>>> df
Vela FlgVela AddCol
0 R 0 1
1 V 1 2
2 V 1 2
3 R 1 3
4 R 1 3
5 V 0 4
6 R 1 5
7 R 1 5
8 R 1 5
Upvotes: 2