Jovoszhou
Jovoszhou

Reputation: 111

python pandas dataframe merge two rows

I'm trying to make an option T-table. I have a pandas dataframe like this:

             name             contract   optionName      call_or_put     IV
0             Crude Oil         CL2009     CL2009C42         call       51.25%
1             Crude Oil         CL2009     CL2009P42         put        52.13%
2             Gold              GC2009     GC2009C1900       call       20%
3             Gold              GC2009     GC2009P1900       put        22%

And I would like to change it to like this:

      name          contract       call          IV_call       put         IV_put
0    Crude Oil       CL2009       CL2009C42       51.25%     CL2009P42      52.13%
1    Gold            GC2009       GC2009C1900      20%       GC2009P1900     22%

I'm thinking to seperate the df to two dataframes, and use combine_first. But I haven't been able to make it work. Any ideas? Thanks in advance.

Upvotes: 0

Views: 300

Answers (3)

Sumit
Sumit

Reputation: 11

One way to do this is to simply divide dataframe into two different dataframe.

df = pd.DataFrame({'name': ['crude oil', 'crude oil', 'gold', 'gold'],
                     'contract': ['CL2009', 'CL2009', 'GC2009', 'GC2009'],
                     'optionName': ['CL2009C42', 'CL2009P42', 'GC2009C1900', 'GC2009P1900'],
                     'call_or_put': ['call', 'put', 'call', 'put'],
                     'IV': ['51.25%', '52.13%', '20%', '22%']},
                    index=[0, 1, 2, 3])

#dropping the duplicate i.e creating a new frame.
new1 = df.drop_duplicates(subset ="name", keep = 'last')

#creating new frame by removing the first frame from the original frame.
new2 = df.drop(new1['optionName'].index)
# final frame
final_df = pd.merge(new2 , new1, on=['name' , 'contract'])
final_df.drop(columns = ['call_or_put_x','call_or_put_y'] , inplace=True)
final_df
# you can rename column name to get your desired result.

Upvotes: 1

sushanth
sushanth

Reputation: 8302

try this, using pandas.concat

import pandas as pd

print(
    pd.concat([
        df[df.call_or_put.eq("call")]
            .rename(columns={"IV": "IV_call", "optionName": "call"}).set_index("name"),
        df.loc[df.call_or_put.eq("put"), ['name', 'IV', 'optionName']]
            .rename(columns={"IV": "IV_put", "optionName": "put"}).set_index("name")
    ], axis=1)
        .drop(columns=['call_or_put']).reset_index()  # drop un-wanted column's
)

        name contract         call IV_call  IV_put          put
0  Crude Oil   CL2009    CL2009C42  51.25%  52.13%    CL2009P42
1       Gold   GC2009  GC2009C1900     20%     22%  GC2009P1900

Upvotes: 0

jsmart
jsmart

Reputation: 3001

Here is a function to extract puts (only) or calls (only) from the original data frame. The 'call data frame' and the 'put data frame' are combined with pd.merge()

def df_to_option(df, option_type):
    return (df.query('call_or_put == @option_type')
              .rename(columns = {'IV': f'IV_{option_type}',
                               'optionName': option_type})
              .drop(columns='call_or_put')
         )

pd.merge(left = df_to_option(df, 'call'),
         right = df_to_option(df, 'put'),
         how = 'inner',
         on = ['name', 'contract'],
        )

Here is the result of the pd.merge() command:

   name        contract   call         IV_call    put          IV_put
0  Crude Oil   CL2009     CL2009C42    51.25%     CL2009P42    52.13%
1  Gold        GC2009     GC2009C1900  20%        GC2009P1900  22%

Upvotes: 0

Related Questions