fffrost

Reputation: 1777

Better way to show only first instance in sequence of repeating values in a pandas dataframe?

When the value in column A of my dataframe is 1 or -1, I want to store that value in a new column. When a value is the same as the previous one (and not zero), I want to set it to zero instead. My code works exactly as I want, but is there a more readable way of doing this?

import pandas as pd

d = {'A':[0,0,1,1,1,-1,-1,-1,0,-1]}

df = pd.DataFrame(d)

df['match'] = df['A'].loc[~df['A'].eq(df['A'].shift())]
df['match'] = df['match'].fillna(0)
df
Out[1]: 
   A  match
0  0    0.0
1  0    0.0
2  1    1.0
3  1    0.0
4  1    0.0
5 -1   -1.0
6 -1    0.0
7 -1    0.0
8  0    0.0
9 -1   -1.0

Upvotes: 0

Views: 45

Answers (2)

ansev

Reputation: 30920

We can pass the fill value as the second argument (other) of Series.where, which fills the non-matching rows directly and avoids the separate Series.fillna call.

df['match'] = df['A'].where(df['A'].ne(df['A'].shift()), 0)
print(df)

Output

   A  match
0  0      0
1  0      0
2  1      1
3  1      0
4  1      0
5 -1     -1
6 -1      0
7 -1      0
8  0      0
9 -1     -1
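
One side effect worth noting, visible in the output above: passing the fill value to Series.where keeps the column as integers, whereas masking first and filling afterwards goes through NaN and ends up as floats. A minimal sketch, assuming the same sample data as the question (the variable names are only for illustration, and the dtype is pinned explicitly):

```python
import pandas as pd

# Sample data from the question; dtype pinned so the result is platform-independent
a = pd.Series([0, 0, 1, 1, 1, -1, -1, -1, 0, -1], dtype="int64")

# True where a value differs from the one before it (the first row is always True)
starts_run = a.ne(a.shift())

# Fill value passed directly: no NaN is ever introduced
direct = a.where(starts_run, 0)

# Mask first, fill later: the intermediate NaN forces a float column
two_step = a.where(starts_run).fillna(0)

print(direct.dtype)    # int64
print(two_step.dtype)  # float64
```

If the float column is not a problem, the two forms are otherwise interchangeable on this data.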

Upvotes: 1

Erfan

Reputation: 42916

As stated in the comments, there's nothing wrong with your current code. But here's another method, using Series.where, Series.diff and Series.fillna:

df['match'] = df['A'].where(df['A'].diff().ne(0)).fillna(0)

   A  match
0  0    0.0
1  0    0.0
2  1    1.0
3  1    0.0
4  1    0.0
5 -1   -1.0
6 -1    0.0
7 -1    0.0
8  0    0.0
9 -1   -1.0
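
If readability is the main goal, either expression could also be wrapped in a small named function. The sketch below is only an illustration: the name first_of_run is hypothetical (it is not part of pandas), and it uses the shift-based comparison rather than diff so that it does not rely on subtraction. It also checks that both variants agree on the sample data.

```python
import pandas as pd

def first_of_run(s: pd.Series, fill=0) -> pd.Series:
    """Keep the first value of each run of equal values; replace the rest with `fill`.

    Hypothetical helper for illustration, not a pandas function.
    """
    return s.where(s.ne(s.shift()), fill)

df = pd.DataFrame({'A': [0, 0, 1, 1, 1, -1, -1, -1, 0, -1]})

df['match'] = first_of_run(df['A'])

# diff-based variant from this answer, for comparison
via_diff = df['A'].where(df['A'].diff().ne(0)).fillna(0)

print(df['match'].tolist())            # [0, 0, 1, 0, 0, -1, 0, 0, 0, -1]
print(df['match'].eq(via_diff).all())  # True
```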

Upvotes: 1
