Amandeep Saluja
Amandeep Saluja

Reputation: 105

Calculate adjusted cost base using Python pandas ( Portfolio Analysis of Stocks Buy/Sell)

I am trying to do a portfolio analysis of my trading and trying to calculate the adjusted cost base price. I have tried almost everything but nothing seems to work. I am able to calculate the adjusted quantity but not able to get the adjusted buy price Could anyone please help?

Here is the Sample trade log raw data

import pandas as pd
import numpy as np

raw_data = {'Date':  ['04-23-2020', '05-05-2020', '05-05-2020', '05-11-2020', '05-11-2020', 
              '05-12-2020', '05-12-2020', '05-27-2020', '06-03-2020', '06-03-2020', 
              '06-03-2020', '06-03-2020', '06-03-2020'],
    'Type': ['Buy', 'Buy', 'Buy', 'Buy', 'Buy', 'Buy', 'Buy', 
             'Sell', 'Sell', 'Sell', 'Buy', 'Sell', 'Sell'],
    'Symbol': ['TSE:AC', 'TSE:AC', 'TSE:HEXO', 'TSE:BPY.UN', 'TSE:BPY.UN', 
               'TSE:BPY.UN', 'TSE:AC', 'TSE:BPY.UN', 'TSE:AC', 'TSE:BPY.UN', 
               'TSE:AC', 'TSE:BPY.UN', 'TSE:HEXO'],
    'Quantity': [75, 100, 1450, 200, 50, 80, 150, 100, 125, 100, 100, 50, 1450],
    'Amount per unit': [18.04, 17.29, 0.73, 13.04, 13.06, 12.65, 15.9, 15.01, 
                        18.05, 14.75, 15.8, 14.7, 1.07],
    'Turnover': [1353, 1729, 1058.5, 2608, 653, 1012, 2385, 1501, 2256.25, 1475, 1580, 735, 1551.5],
    }
df = pd.DataFrame (raw_data, columns = ['Date','Type','Symbol','Quantity','Amount per unit', 'Turnover']).sort_values(['Date','Symbol']).reset_index(drop = True)

I am able to get the Adjusted quantity without any issues but I am not able to get the correct Adjusted Price Per Unit. The condition here is if I sell a stock, my Adjusted Price Per Unit should not change and remain the same as the last adjusted price when the buy was made for that stock.

#to calculate adjusted quantity. this works as expected
df['Adjusted Quantity'] = df.apply(lambda x: ((x.Type == "Buy") - (x.Type == "Sell")) * x['Quantity'], axis = 1)
df['Adjusted Quantity'] = df.groupby('Symbol')['Adjusted Quantity'].cumsum()


#section where I am having problem. Works good until I reach the row where sell was made
df['Adjusted Price Per Unit'] = df.apply(lambda x: ((x.Type == "Buy") - (x.Type == "Sell")) * x['Turnover'], axis = 1)
df['Adjusted Price Per Unit'] = df.groupby('Symbol')['Adjusted Price Per Unit'].cumsum().div(df['Adjusted Quantity'])

Running this code will result in the following

enter image description here

For eg:, the adjusted price for the row at index 7 should be 12.948 (same as row at index 6) instead of 12.052. Also, the last row adjusted price should be 0.73 (same as row at index 2) since I am buying and selling same amount of shares of the stock.

Eg 2: At index 6, I bought 80 shares of BPY at 12.65 which brought my average price down to 12.94 for a total of 330 shares(250+80). Now, I'm selling 100 shares at 15.01(index 7). My code brings it adjusted cost to 12.05. What I need my adjusted cost to be 12.94 instead of 12.05. Simply put, ignore adjusting the price if the transaction type is Sell. Use the last adjusted price in the last buy type transaction for that particular stock.

The last 2 lines of my code are not correct. Could you please help me with correctly calculating the adjusted price per unit? Thanks :)

Upvotes: 2

Views: 2242

Answers (2)

Stanley
Stanley

Reputation: 839

The above answer by @r-beginners does not properly treat new purchases after quantity is 0 as observed by @guialmachado and @Rene Chan. Hence, here is my code which does properly calculate it, using Rene Chan's example.

import pandas as pd
import numpy as np

raw_data = {'Date':  ['04-23-2020', '05-05-2020', '05-05-2020', '05-11-2020', '05-11-2020', 
              '05-12-2020', '05-12-2020', '05-27-2020', '06-03-2020', '06-03-2020', 
              '06-03-2020', '06-03-2020', '06-03-2020', '06-05-2020'],
    'Type': ['Buy', 'Buy', 'Buy', 'Buy', 'Buy', 'Buy', 'Buy', 
             'Sell', 'Sell', 'Sell', 'Buy', 'Sell', 'Sell', 'Buy'],
    'Symbol': ['TSE:AC', 'TSE:AC', 'TSE:HEXO', 'TSE:BPY.UN', 'TSE:BPY.UN', 
               'TSE:BPY.UN', 'TSE:AC', 'TSE:BPY.UN', 'TSE:AC', 'TSE:BPY.UN', 
               'TSE:AC', 'TSE:BPY.UN', 'TSE:HEXO', 'TSE:HEXO'],
    'Quantity': [75, 100, 1450, 200, 50, 80, 150, 100, 125, 100, 100, 50, 1450, 3000],
    'Amount per unit': [18.04, 17.29, 0.73, 13.04, 13.06, 12.65, 15.9, 15.01, 
                        18.05, 14.75, 15.8, 14.7, 1.07, 2.50],
    'Turnover': [1353, 1729, 1058.5, 2608, 653, 1012, 2385, 1501, 2256.25, 1475, 1580, 735, 1551.5, 7500],
    }
df = pd.DataFrame (raw_data, columns = ['Date','Type','Symbol','Quantity','Amount per unit', 'Turnover']).sort_values(['Date','Symbol']).reset_index(drop = True)

df.sort_values(['Symbol','Date','Type'], ascending=[True, True, True], inplace=True)

df['Adjusted Quantity'] = df.apply(lambda x: ((x.Type == "Buy") - (x.Type == "Sell")) * x['Quantity'], axis = 1)
df['Adjusted Quantity'] = df.groupby('Symbol')['Adjusted Quantity'].cumsum()

def calculate_adjusted_price_per_unit(group):
    group['Adjusted Turnover'] = group.apply(lambda x: ((x.Type == "Buy") - (x.Type == "Sell")) * x['Turnover'], axis = 1)
    group['Adjusted Price Per Unit'] = 0
    last_entry_price = 0
    
    current_turnover = 0
    for index, row in group.iterrows():
        current_turnover += row['Adjusted Turnover']
        
        if row['Type'] == 'Buy':
            group.at[index, 'Adjusted Price Per Unit'] = current_turnover / row['Adjusted Quantity']

        if row['Type'] == 'Sell':
            group.at[index, 'Adjusted Price Per Unit'] = last_entry_price
        
        if row['Adjusted Quantity'] == 0:
            current_turnover = 0
            group.at[index, 'Adjusted Price Per Unit'] = last_entry_price
    
        last_entry_price = group.at[index, 'Adjusted Price Per Unit']
    
    return group

df = df.groupby('Symbol').apply(calculate_adjusted_price_per_unit)
df.drop(columns=['Adjusted Turnover'], inplace=True)

which correctly outputs

          Date  Type      Symbol  Quantity  Amount per unit  Turnover  Adjusted Quantity  Adjusted Price Per Unit
0   04-23-2020   Buy      TSE:AC        75            18.04   1353.00                 75                18.040000
1   05-05-2020   Buy      TSE:AC       100            17.29   1729.00                175                17.611429
5   05-12-2020   Buy      TSE:AC       150            15.90   2385.00                325                16.821538
9   06-03-2020   Buy      TSE:AC       100            15.80   1580.00                425                16.581176
8   06-03-2020  Sell      TSE:AC       125            18.05   2256.25                300                16.581176
3   05-11-2020   Buy  TSE:BPY.UN       200            13.04   2608.00                200                13.040000
4   05-11-2020   Buy  TSE:BPY.UN        50            13.06    653.00                250                13.044000
6   05-12-2020   Buy  TSE:BPY.UN        80            12.65   1012.00                330                12.948485
7   05-27-2020  Sell  TSE:BPY.UN       100            15.01   1501.00                230                12.948485
10  06-03-2020  Sell  TSE:BPY.UN       100            14.75   1475.00                130                12.948485
11  06-03-2020  Sell  TSE:BPY.UN        50            14.70    735.00                 80                12.948485
2   05-05-2020   Buy    TSE:HEXO      1450             0.73   1058.50               1450                 0.730000
12  06-03-2020  Sell    TSE:HEXO      1450             1.07   1551.50                  0                 0.730000
13  06-05-2020   Buy    TSE:HEXO      3000             2.50   7500.00               3000                 2.500000

Upvotes: 2

r-beginners
r-beginners

Reputation: 35135

If you don't calculate the adjusted price for the sale, as you would comment, then you can process the sale line as NA and fill it in with the immediately preceding value of the same stock. As a confirmation in your code, do you not need to consider the same stock when calculating the 'Adjusted Quantity' at the beginning?

df.sort_values(['Symbol','Date','Type'], ascending=[True, True, True], inplace=True)
# your code
df['Adjusted Quantity'] = df.apply(lambda x: ((x.Type == "Buy") - (x.Type == "Sell")) * x['Quantity'], axis = 1)
df['Adjusted Quantity'] = df.groupby('Symbol')['Adjusted Quantity'].cumsum()
df['Adjusted Price Per Unit'] = df.apply(lambda x: ((x.Type == "Buy") - (x.Type == "Sell")) * x['Turnover'], axis = 1)
df['Adjusted Price Per Unit'] = df.groupby('Symbol')['Adjusted Price Per Unit'].cumsum().div(df['Adjusted Quantity'])

df.loc[df['Type'] == 'Sell',['Adjusted Price Per Unit']] = np.NaN
df.fillna(method='ffill', inplace=True)

|    | Date       | Type   | Symbol     |   Quantity |   Amount per unit |   Turnover |   Adjusted Quantity |   Adjusted Price Per Unit |
|---:|:-----------|:-------|:-----------|-----------:|------------------:|-----------:|--------------------:|--------------------------:|
|  0 | 04-23-2020 | Buy    | TSE:AC     |         75 |             18.04 |    1353    |                  75 |                   18.04   |
|  1 | 05-05-2020 | Buy    | TSE:AC     |        100 |             17.29 |    1729    |                 175 |                   17.6114 |
|  5 | 05-12-2020 | Buy    | TSE:AC     |        150 |             15.9  |    2385    |                 325 |                   16.8215 |
|  9 | 06-03-2020 | Buy    | TSE:AC     |        100 |             15.8  |    1580    |                 425 |                   16.5812 |
|  8 | 06-03-2020 | Sell   | TSE:AC     |        125 |             18.05 |    2256.25 |                 300 |                   16.5812 |
|  3 | 05-11-2020 | Buy    | TSE:BPY.UN |        200 |             13.04 |    2608    |                 200 |                   13.04   |
|  4 | 05-11-2020 | Buy    | TSE:BPY.UN |         50 |             13.06 |     653    |                 250 |                   13.044  |
|  6 | 05-12-2020 | Buy    | TSE:BPY.UN |         80 |             12.65 |    1012    |                 330 |                   12.9485 |
|  7 | 05-27-2020 | Sell   | TSE:BPY.UN |        100 |             15.01 |    1501    |                 230 |                   12.9485 |
| 10 | 06-03-2020 | Sell   | TSE:BPY.UN |        100 |             14.75 |    1475    |                 130 |                   12.9485 |
| 11 | 06-03-2020 | Sell   | TSE:BPY.UN |         50 |             14.7  |     735    |                  80 |                   12.9485 |
|  2 | 05-05-2020 | Buy    | TSE:HEXO   |       1450 |              0.73 |    1058.5  |                1450 |                    0.73   |
| 12 | 06-03-2020 | Sell   | TSE:HEXO   |       1450 |              1.07 |    1551.5  |                   0 |                    0.73   |

Upvotes: 6

Related Questions