Reputation: 2189
I have a data frame like this
df:
col1 col2
1 10
1 20
2 11
3 33
1 20
1 10
2 24
3 21
3 28
I want to group this data frame on col1 wherever there are consecutive equal values, and take the last value of each consecutive group.
The final data frame should look like:
df
col1 col2
1 20
2 11
3 33
1 10
2 24
3 28
I have tried something like:
df['b_new'] = df.groupby('col1')['col2'].transform('last')
But it's missing the consecutive condition.
How can I implement this in the most effective way using pandas/Python?
Upvotes: 1
Views: 325
Reputation: 862661
Use boolean indexing: compare col1 with Series.ne against the same column shifted by -1 with Series.shift, which keeps only the last row of each run of consecutive duplicates:
df1 = df[df['col1'].ne(df['col1'].shift(-1))]
print(df1)
col1 col2
1 1 20
2 2 11
3 3 33
5 1 10
6 2 24
8 3 28
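For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frame is rebuilt from the sample data in the question; the names `is_last_in_run` and `result` are just illustrative, and the `reset_index(drop=True)` step is an optional extra for when the original row labels are not needed:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "col1": [1, 1, 2, 3, 1, 1, 2, 3, 3],
    "col2": [10, 20, 11, 33, 20, 10, 24, 21, 28],
})

# True where the next row holds a different col1 value.
# The final row is compared against NaN, so it is always True.
is_last_in_run = df["col1"].ne(df["col1"].shift(-1))

# Keep only the last row of each consecutive run; drop the old index labels
result = df[is_last_in_run].reset_index(drop=True)
print(result)
```

Without the `reset_index` call the original labels (1, 2, 3, 5, 6, 8) are kept, as in the output above.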
Detail:
print(df['col1'].ne(df['col1'].shift(-1)))
0 False
1 True
2 True
3 True
4 False
5 True
6 True
7 False
8 True
Name: col1, dtype: bool
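If you would rather stay close to the groupby attempt in the question, a common alternative (not part of the approach above) is to label each consecutive run with shift plus cumsum and group on that label. This is only a sketch; `run_id` and `last_per_run` are names I made up for illustration:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "col1": [1, 1, 2, 3, 1, 1, 2, 3, 3],
    "col2": [10, 20, 11, 33, 20, 10, 24, 21, 28],
})

# A new run starts wherever col1 differs from the previous row;
# the cumulative sum of those start flags gives each run its own id.
run_id = df["col1"].ne(df["col1"].shift()).cumsum().rename("run_id")

# Group on the run id instead of on col1 itself, then take the last row per run
last_per_run = df.groupby(run_id).last().reset_index(drop=True)
print(last_per_run)
```

Note that GroupBy.last returns the last non-null value per column, so on data with missing values it can differ from the boolean mask, which always keeps whole rows. For plain filtering the mask is probably the simpler choice; the run id is handy when you need other per-run aggregations too.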
Upvotes: 2