Kallol

Reputation: 2189

How to find the last value of each run of consecutive values in a pandas DataFrame?

I have a data frame like this:

df:
col1     col2
 1        10
 1        20
 2        11
 3        33
 1        20
 1        10
 2        24
 3        21
 3        28

I want to group this data frame on col1 wherever its values are consecutive, and take the last col2 value from each consecutive group.

The final data frame should look like:

df
col1    col2
 1       20
 2       11
 3       33
 1       10
 2       24
 3       28

I have tried something like:

 df['b_new'] = df.groupby('col1')['col2'].transform('last')

But it's missing the consecutive condition.

How can I implement this in the most efficient way using pandas/Python?

Upvotes: 1

Views: 325

Answers (1)

jezrael

Reputation: 862661

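Use boolean indexing: compare col1 with itself shifted by -1 (Series.shift) using Series.ne. The comparison is True wherever the next row has a different value, which is the last row of each run of consecutive duplicates: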

df1 = df[df['col1'].ne(df['col1'].shift(-1))]
print (df1)
   col1  col2
1     1    20
2     2    11
3     3    33
5     1    10
6     2    24
8     3    28
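
To run this end to end, the frame from the question can be rebuilt by hand. This is a minimal sketch: the values are copied from the question, and the name mask is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": [1, 1, 2, 3, 1, 1, 2, 3, 3],
    "col2": [10, 20, 11, 33, 20, 10, 24, 21, 28],
})

# shift(-1) moves every value up one row, so each row is compared with the next one.
# The last row is compared with NaN, which always counts as "not equal".
mask = df["col1"].ne(df["col1"].shift(-1))

# Keep only the rows where the next col1 value differs: the last row of each run.
df1 = df[mask]
print(df1)
```

The original index labels are kept; append .reset_index(drop=True) if a fresh 0..n-1 index is wanted.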

Detail:

print (df['col1'].ne(df['col1'].shift(-1)))
0    False
1     True
2     True
3     True
4    False
5     True
6     True
7    False
8     True
Name: col1, dtype: bool
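
If a groupby form is preferred, as in the attempt from the question, one possible variation is to label each consecutive run first and group on that label. This is only a sketch and not part of the solution above; run_id and df2 are illustrative names.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": [1, 1, 2, 3, 1, 1, 2, 3, 3],
    "col2": [10, 20, 11, 33, 20, 10, 24, 21, 28],
})

# True at the first row of every run; the cumulative sum turns that into a run label.
run_id = df["col1"].ne(df["col1"].shift()).cumsum()

# tail(1) keeps the last row of each run, with the original index and row order.
df2 = df.groupby(run_id).tail(1)
print(df2)
```

On this sample it selects the same rows as the boolean mask. The mask avoids the groupby step, so it is likely the lighter option when only the last row of each run is needed.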

Upvotes: 2
