Reputation: 581
I have a Dataframe as shown:
I have to normalise the Close column of dataframe as follows:
For Each symbol we have to divide the close price on succeeding days by the First close price. This means the normalised closed for APLAPOLLO on 24/11/2020 will be calculated as:
Normalised_Close_price=(Close on 24/11/2020)/(Close on 23/11/2020)=0.9915( Valid Only For Aplapollo)
Now if the symbol changes the method remains same only the values in the above formulae changes. Hence For AUBANK the normalised closed on 24/11/2020 will be calculated as:
Normalised_Close_price=(Close on 24/11/2020)/(Close on 23/11/2020)=0.9915( Valid Only for AUBANK)
The normalised close for other symbol should be calculated in the same way.
Hence after the calculation the dataframe should look as shown:
eg: For Aplapollo from above explanation:
Similarly for other symbols
I have referred to the following answer on normalising dataframe columns:
Normalize columns of pandas data frame
This does not help as the values are changing as per the condition in my case
Upvotes: 1
Views: 270
Reputation: 13349
Try:
df1['Normalize'] = df1.groupby('Symbol')['Close'].transform(lambda x: x/x.iloc[0]).fillna(1)#.reset_index()
As commented by Shubham:
you can divide groups by its first value by
df['Close'] /= df1.groupby('Symbol')['Close'].transform('first')
:)
df1:
Date Symbol Close Normalize
0 2020-11-23 APLAPOLLO 3247.45 1.000000
1 2020-11-24 APLAPOLLO 3219.95 0.991532
2 2020-11-25 APLAPOLLO 3220.45 0.991686
3 2020-11-26 APLAPOLLO 3178.95 0.978907
4 2020-11-27 APLAPOLLO 3378.90 1.040478
5 2020-12-01 APLAPOLLO 3446.85 1.061402
6 2020-12-02 APLAPOLLO 3514.55 1.082249
7 2020-12-03 APLAPOLLO 3545.80 1.091872
8 2020-12-04 APLAPOLLO 3708.60 1.142004
9 2020-12-07 APLAPOLLO 3868.55 1.191258
10 2020-12-08 APLAPOLLO 3750.30 1.154845
11 2020-12-09 APLAPOLLO 3801.35 1.170565
12 2020-12-10 APLAPOLLO 3766.65 1.159879
13 2020-12-11 APLAPOLLO 3674.30 1.131442
14 2020-12-14 APLAPOLLO 3814.80 1.174706
15 2020-12-15 APLAPOLLO 780.55 0.240358
16 2020-12-16 APLAPOLLO 790.20 0.243329
17 2020-12-17 APLAPOLLO 791.20 0.243637
18 2020-12-18 APLAPOLLO 769.70 0.237017
19 2020-12-21 APLAPOLLO 726.60 0.223745
20 2020-12-22 APLAPOLLO 744.30 0.229195
21 2020-11-23 AUBANK 869.65 1.000000
22 2020-11-24 AUBANK 874.35 1.005404
23 2020-11-25 AUBANK 856.25 0.984592
24 2020-11-26 AUBANK 861.05 0.990111
25 2020-11-27 AUBANK 839.05 0.964813
26 2020-12-01 AUBANK 872.90 1.003737
27 2020-12-02 AUBANK 886.65 1.019548
28 2020-12-03 AUBANK 880.30 1.012246
29 2020-12-04 AUBANK 880.45 1.012419
30 2020-12-07 AUBANK 898.65 1.033347
31 2020-12-08 AUBANK 907.80 1.043868
32 2020-12-09 AUBANK 918.90 1.056632
33 2020-12-10 AUBANK 911.05 1.047605
34 2020-12-11 AUBANK 920.30 1.058242
35 2020-12-14 AUBANK 929.45 1.068763
36 2020-12-15 AUBANK 922.60 1.060887
37 2020-12-16 AUBANK 915.80 1.053067
38 2020-12-17 AUBANK 943.15 1.084517
39 2020-12-18 AUBANK 897.00 1.031449
40 2020-12-21 AUBANK 840.45 0.966423
41 2020-12-22 AUBANK 856.00 0.984304
42 2020-11-23 AARTIDRUGS 711.70 1.000000
Upvotes: 3
Reputation: 400
I think what you are looking for is this.
df["Normalise"] = df.groupby('Symbol', sort=False)['Close'].rolling(2).apply(lambda x: x.iloc[1] / x.iloc[0]).reset_index(0, drop=True)
Upvotes: 1