Struggling brat
Struggling brat

Reputation: 37

Percentage change between two rows in pandas based on certain criteria

I have a data frame:

price  value
10      B
10.3    H
11      H
11.5    S
11.1   
11      B
10.8    H
10      SL

I want to add another column change that takes a percentage of price change between one value after B and S | SL. The % can end after S or SL, whatever comes first.

Here's my intended outcome:

price  value  change
10      B       
10.3    H      0.116  # --> (11.5 - 10.3)/10.3 ; note that it starts from first H after B
11      H
11.5    S
11.1   
11      B      -0.074 # --> (10 - 10.8)/10.8 ; note that this one ends at SL instead of S
10.8    H
10      SL

I came with the following code, but it first makes a new column of daily change and adds only based on certain dates i.e. 10 days. I want % based on B, S, and SL values and with dynamic dates. So, this one doesn't do any good to me.

df['change'] = df[df.value== "B"].index.to_series().apply(lambda x: df.daychange.shift(-1).iloc[x: x + 10].sum())

Little help will be appreciated. Thanks!

Upvotes: 0

Views: 353

Answers (1)

Adam Kubalica
Adam Kubalica

Reputation: 128

Just to clarify: In your second output example the one value after B is 10.8 not 11, am I right? Also the value corresponding to price 11.1 is null? If so then this primitive loop may work:

df = pd.DataFrame({
     'price': [10, 10.3, 11, 11.5, 11.1, 11, 10.8, 10],
     'value': ['B', 'H','H', 'S',None, 'B','H', 'SL']
 })
B_list = df[df['value'] == 'B'].index
S_list = df[(df['value'] == 'S') | (df['value'] == 'SL' )].index
df['change'] = None
for el in range(len(B_list)):    
    x = (df['price'].iloc[B_list[el]+1] - df['price'].iloc[S_list[el]])/ df['price'].iloc[B_list[el]+1]
     df.loc[[B_list[el]+1], 'change'] = x

But it won't work if you have consecutive 'B's in the column (that's not the case in the above example)

Upvotes: 1

Related Questions