SA12

Reputation: 318

Replace consecutive duplicate values with NaN or blank

I have a DataFrame (df) with two price columns, and I want to compare them row by row to make a decision:

df['Decision'] = np.where(df['price1'] > df['price2'], 'sell', np.where(df['price1'] < df['price2'], 'buy', np.nan))

My output is:

price1 price2 Decision
50 50 NaN
100 200 buy
70 140 buy
150 200 buy
150 50 sell
60 20 sell
30 70 buy
60 100 buy

But I want to keep only the first "buy" or "sell" signal in each run and blank out the repeats until the next signal, like this:

price1 price2 Decision
50 50 NaN
100 200 buy
70 140
150 200
150 50 sell
60 20
30 70 buy
60 100

Upvotes: 1

Views: 127

Answers (1)

jezrael

Reputation: 862671

Use Series.where with a mask that is True only where Decision differs from the previous row:

m = df['Decision'].ne(df['Decision'].shift())
df['Decision'] = df['Decision'].where(m, '')
print(df)
   price1  price2 Decision
0      50      50      NaN
1     100     200      buy
2      70     140         
3     150     200         
4     150      50     sell
5      60      20         
6      30      70      buy
7      60     100         
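
To see why this works, it can help to look at the mask next to the data. The following is a minimal sketch, not part of the original answer: it rebuilds the sample frame from the question with the Decision values typed in directly (an assumption, so the example does not depend on how they were computed), and `first_signals` is a name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question; Decision values are entered by hand
# (an assumption made to keep this sketch self-contained).
df = pd.DataFrame({
    "price1": [50, 100, 70, 150, 150, 60, 30, 60],
    "price2": [50, 200, 140, 200, 50, 20, 70, 100],
    "Decision": [np.nan, "buy", "buy", "buy", "sell", "sell", "buy", "buy"],
})

# True where a row's Decision differs from the row directly above it.
# NaN compares as "not equal" to everything, so NaN rows are kept as-is.
m = df["Decision"].ne(df["Decision"].shift())

# Keep the value where the mask is True, otherwise use an empty string.
first_signals = df["Decision"].where(m, "")

# Show the original column, the mask, and the result side by side.
print(pd.DataFrame({"Decision": df["Decision"], "mask": m, "result": first_signals}))
```

Because the comparison is against the previous row only, a signal that reappears after a different one (the second "buy" run here) is treated as a new first signal.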

Or use np.where with the same mask:

m = df['Decision'].ne(df['Decision'].shift()) 
df['Decision'] = np.where(m, df['Decision'], '')
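
If NaN is preferred over a blank string for the repeated rows, one option is to call Series.where without a replacement value, since it fills with NaN by default. This is a hedged sketch on a standalone Series with the same values as the question's Decision column; the names `decision` and `as_nan` are illustrative only.

```python
import numpy as np
import pandas as pd

# Same Decision values as in the question, as a standalone Series
# (hypothetical name, used only for this illustration).
decision = pd.Series([np.nan, "buy", "buy", "buy", "sell", "sell", "buy", "buy"])

# True where the value differs from the previous row.
m = decision.ne(decision.shift())

# With no second argument, where() replaces masked-out rows with NaN.
as_nan = decision.where(m)
print(as_nan)
```

Note that with NaN as the fill value, the original "no signal" row and the suppressed repeats become indistinguishable, which may or may not matter for later processing.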

Upvotes: 3
