MathMan 99

Reputation: 755

Python: define function with multiple if statements and return output

I have the following three dataframes:

import pandas as pd

df1 = pd.DataFrame(
    {
        "A_price": [10, 12, 15],
        "B_price": [20, 19, 29],
        "C_price": [23, 21, 4],
        "D_price": [45, 47, 44],
    },
    index=['01-01-2020', '01-02-2020', '01-03-2020']
)
df2 = pd.DataFrame(
    {
        "A_mid": [10, 12, 15],
        "B_mid": [20, 19, 29],
        "C_mid": [23, 21, 4],
        "D_mid": [45, 47, 44],
    },
    index=['01-01-2020', '01-02-2020', '01-03-2020']
)
df3 = pd.DataFrame(
    {
        "A_weight": [0.1, 0.2, 0.4],
        "B_weight": [0.2, 0.5, 0.1],
        "C_weight": [0.3, 0.2, 0.1],
        "D_weight": [0.4, 0.1, 0.4],
    },
    index=['01-01-2020', '01-02-2020', '01-03-2020']
)

I have defined the following function:

def price_weight(df1, df3):

    df_price_weight = pd.merge(df1, df3, left_index=True, right_index=True)
    if 'close' in df_price_weight.columns:
        df_price_weight.filter(regex=('close|weight'))
        df_price_weight.columns = df_price_weight.columns.str.split('_', expand=True)
        df_price_weight = df_price_weight.sort_index(axis=1)

    elif 'price' in df_price_weight.columns:
        df_price_weight.filter(regex=('price|weight'))
        df_price_weight.columns = df_price_weight.columns.str.split('_', expand=True)
        df_price_weight.rename(columns={'price':'close'}, inplace=True)
        df_price_weight = df_price_weight.sort_index(axis=1)
    
    else:
        df_price_weight.filter(regex=('mid|weight'))
        df_price_weight.columns = df_price_weight.columns.str.split('_', expand=True)
        df_price_weight.rename(columns={'mid':'close'}, inplace=True)
        df_price_weight = df_price_weight.sort_index(axis=1)

    return df_price_weight

For some reason, when I call price_weight(df1, df3), I don't get the right output. I should receive a dataframe with columns ['close', 'weight'], but I receive ['price', 'weight'].

How do I successfully define a function with multiple if statements to return the desired output?

UPDATE: I am trying to pass the result to another function:

def wmedian(dtfrm):
    df = dtfrm.unstack().sort_values('close')
    return df.loc[df['weight'].cumsum() > 0.5, 'close'].iloc[0]

where

dtfrm = price_weight(df1, df3)

The wmedian function should return a dataframe with close prices, but I am getting " KeyError: 'close' ".

What should I change in the function?

Thank you.

Upvotes: 1

Views: 193

Answers (1)

Stef

Reputation: 15505

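The condition 'price' in df_price_weight.columns is never going to be True, because in on a column index tests for an exact label match, and no column is named exactly 'price' (the columns are 'A_price', 'B_price', and so on). The same applies to the 'close' check, so the function always falls through to the else branch, where renaming 'mid' does nothing.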

Instead, I suggest:

any(('price' in column_name) for column_name in df_price_weight.columns)
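
Putting that check into the function could look roughly like the sketch below. It is one possible rewrite, not the only one: the parameter names prices and weights and the local name suffix are illustrative, and raising a ValueError when no matching columns are found is an assumption (the original falls back to 'mid' instead). It also assigns the result of filter(), which returns a new frame rather than modifying the existing one, and limits the rename to the second column level.

```python
import pandas as pd

dates = ['01-01-2020', '01-02-2020', '01-03-2020']
df1 = pd.DataFrame(
    {"A_price": [10, 12, 15], "B_price": [20, 19, 29],
     "C_price": [23, 21, 4], "D_price": [45, 47, 44]},
    index=dates,
)
df3 = pd.DataFrame(
    {"A_weight": [0.1, 0.2, 0.4], "B_weight": [0.2, 0.5, 0.1],
     "C_weight": [0.3, 0.2, 0.1], "D_weight": [0.4, 0.1, 0.4]},
    index=dates,
)


def price_weight(prices, weights):
    merged = pd.merge(prices, weights, left_index=True, right_index=True)

    # Find which price-like suffix is present, testing each column name
    # individually instead of testing membership in the column index.
    suffix = next(
        (s for s in ("close", "price", "mid")
         if any(name.endswith("_" + s) for name in merged.columns)),
        None,
    )
    if suffix is None:
        # Assumption: fail loudly rather than silently defaulting to 'mid'.
        raise ValueError("no close, price, or mid columns found")

    # filter() returns a new frame, so the result has to be assigned.
    merged = merged.filter(regex=f"_({suffix}|weight)$")
    # 'A_price' -> ('A', 'price'): columns become a two-level MultiIndex.
    merged.columns = merged.columns.str.split("_", expand=True)
    # Rename only the second level so 'price' or 'mid' becomes 'close'.
    merged = merged.rename(columns={suffix: "close"}, level=1)
    return merged.sort_index(axis=1)


result = price_weight(df1, df3)
print(result.columns.get_level_values(1).unique().tolist())
# ['close', 'weight']
```

With the second level named 'close', a lookup such as sort_values('close') on an unstacked row no longer fails for lack of that label.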

Upvotes: 2
