czzz0414
czzz0414

Reputation: 67

questions about groupby dataframe

Here is part of the data of scaffold_table enter image description here

I would like to create a summary table that returns the overall yield per ticker. The overall yield should be calculated as the total PnL per ticker divided by the last date's position per ticker

# Create a summary table of your average daily PnL, total PnL, and overall yield per ticker
summary_table = pd.DataFrame(scaffold_table.groupby(['Date','Ticker'])['Daily PnL'].mean())

position_ticker = pd.DataFrame(scaffold_table.groupby(['Date','Ticker'])['Position'].sum())

# the total PnL is the sum of PnL per Ticker after two years period
totals = summary_table.droplevel('Date').groupby('Ticker').sum().rename(columns={'Daily PnL':'total PnL'})

summary_table = summary_table.join(totals, on='Ticker')
summary_table = summary_table.join(position_ticker, on = ['Date','Ticker'], how='inner')
summary_table['Yield'] = summary_table.loc['2022-04-29']['total PnL']/summary_table.loc['2022-04-29']['Position']
summary_table

But the yield is showing NaN, could anyone take a look at my codes? I used ['2022-04-29'] because it is the last date, but I think there are some codes to return the last date without explicitly inputting that. enter image description here

Upvotes: 0

Views: 31

Answers (1)

constantstranger
constantstranger

Reputation: 9389

Here's a way to do what your question asks:

# Create a summary table of your average daily PnL, total PnL, and overall yield per ticker
summary_table = pd.DataFrame(scaffold_table.groupby(['Date','Ticker'])['Daily PnL'].mean())

# the total PnL is the sum of PnL per Ticker after two years period
totals = summary_table.droplevel('Date').groupby('Ticker').sum().rename(columns={'Daily PnL':'total PnL'})

summary_table = summary_table.join(totals, on='Ticker')

Full test code:

import pandas as pd
scaffold_table = pd.DataFrame({
'Position':[2000]*5, 
'Company':['Amazon', 'Amazon', 'Alphabet', 'Amazon', 'Alphabet'], 
'Date':['2020-05-26','2020-05-27','2020-05-27','2020-05-28','2020-05-28'],
'Ticker':['AMZN','AMZN','GOOG','AMZN','GOOG'],
'Open':[2458.,2404.9899,1417.25,2384.330078,1396.859985],
'Volume':[3568200,5056900,1685800,3190200,1692200],
'Daily Return':[-0.006164,-0.004736,0.000579,-0.003854,-0.000783],
'Daily PnL':[-12.327054,-9.472236,1.157283,-7.708126,-1.565741],
'Cumulative PnL/Ticker':[-12.327054,-21.799290,1.157283,-29.507417,-0.408459]})

print('\nscaffold_table:'); print(scaffold_table)

# Create a summary table of your average daily PnL, total PnL, and overall yield per ticker
summary_table = pd.DataFrame(scaffold_table.groupby(['Date','Ticker'])['Daily PnL'].mean())
print('\nsummary_table:'); print(summary_table)
# the total PnL is the sum of PnL per Ticker after two years period
totals = summary_table.droplevel('Date').groupby('Ticker').sum().rename(columns={'Daily PnL':'total PnL'})
print('\ntotals:'); print(totals)
summary_table = summary_table.join(totals, on='Ticker')
print('\nsummary_table:'); print(summary_table)

Input:


scaffold_table:
   Position   Company        Date Ticker         Open   Volume  Daily Return  Daily PnL  Cumulative PnL/Ticker
0      2000    Amazon  2020-05-26   AMZN  2458.000000  3568200     -0.006164 -12.327054             -12.327054
1      2000    Amazon  2020-05-27   AMZN  2404.989900  5056900     -0.004736  -9.472236             -21.799290
2      2000  Alphabet  2020-05-27   GOOG  1417.250000  1685800      0.000579   1.157283               1.157283
3      2000    Amazon  2020-05-28   AMZN  2384.330078  3190200     -0.003854  -7.708126             -29.507417
4      2000  Alphabet  2020-05-28   GOOG  1396.859985  1692200     -0.000783  -1.565741              -0.408459

Output:

summary_table:
                   Daily PnL
Date       Ticker
2020-05-26 AMZN   -12.327054
2020-05-27 AMZN    -9.472236
           GOOG     1.157283
2020-05-28 AMZN    -7.708126
           GOOG    -1.565741

totals:
        total PnL
Ticker
AMZN   -29.507416
GOOG    -0.408458

summary_table:
                   Daily PnL  total PnL
Date       Ticker
2020-05-26 AMZN   -12.327054 -29.507416
2020-05-27 AMZN    -9.472236 -29.507416
           GOOG     1.157283  -0.408458
2020-05-28 AMZN    -7.708126 -29.507416
           GOOG    -1.565741  -0.408458

Upvotes: 1

Related Questions