Reputation: 755
I have the following three dataframes:
import pandas as pd

df1 = pd.DataFrame(
    {
        "A_price": [10, 12, 15],
        "B_price": [20, 19, 29],
        "C_price": [23, 21, 4],
        "D_price": [45, 47, 44],
    },
    index=['01-01-2020', '01-02-2020', '01-03-2020']
)
df2 = pd.DataFrame(
    {
        "A_mid": [10, 12, 15],
        "B_mid": [20, 19, 29],
        "C_mid": [23, 21, 4],
        "D_mid": [45, 47, 44],
    },
    index=['01-01-2020', '01-02-2020', '01-03-2020']
)
df3 = pd.DataFrame(
    {
        "A_weight": [0.1, 0.2, 0.4],
        "B_weight": [0.2, 0.5, 0.1],
        "C_weight": [0.3, 0.2, 0.1],
        "D_weight": [0.4, 0.1, 0.4],
    },
    index=['01-01-2020', '01-02-2020', '01-03-2020']
)
I have defined the following function:
def price_weight(df1, df3):
    df_price_weight = pd.merge(df1, df3, left_index=True, right_index=True)
    if 'close' in df_price_weight.columns:
        df_price_weight.filter(regex=('close|weight'))
        df_price_weight.columns = df_price_weight.columns.str.split('_', expand=True)
        df_price_weight = df_price_weight.sort_index(axis=1)
    elif 'price' in df_price_weight.columns:
        df_price_weight.filter(regex=('price|weight'))
        df_price_weight.columns = df_price_weight.columns.str.split('_', expand=True)
        df_price_weight.rename(columns={'price':'close'}, inplace=True)
        df_price_weight = df_price_weight.sort_index(axis=1)
    else:
        df_price_weight.filter(regex=('mid|weight'))
        df_price_weight.columns = df_price_weight.columns.str.split('_', expand=True)
        df_price_weight.rename(columns={'mid':'close'}, inplace=True)
        df_price_weight = df_price_weight.sort_index(axis=1)
    return df_price_weight
For some reason, when I call price_weight(df1, df3), I don't get the right output. I should receive a dataframe with columns ['close', 'weight'], but I receive ['price', 'weight'].
How do I successfully define a function with multiple if statements to return the desired output?
UPDATE: I am trying to pass the result to another function:
def wmedian(dtfrm):
    df = dtfrm.unstack().sort_values('close')
    return df.loc[df['weight'].cumsum() > 0.5, 'close'].iloc[0]
where
dtfrm = price_weight(df1, df3)
The wmedian function should return a dataframe with close prices, but I am getting "KeyError: 'close'".
What should I change in the function?
Thank you.
Upvotes: 1
Views: 193
Reputation: 15505
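The condition 'price' in df_price_weight.columns is never going to be True, because the exact string 'price' is not the name of a column. The merged columns are named 'A_price', 'B_price', and so on, and the in operator on the columns tests for an exact label, not a substring. The function therefore falls through to the else branch, where renaming 'mid' has nothing to match and 'price' is left as it is.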
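To see the exact-label behaviour in isolation, here is a minimal sketch. The `columns` index and the two variable names are made up for illustration. They mimic the labels produced by the merge in the question, shortened to two assets.

```python
import pandas as pd

# Column labels as they look right after the merge in the question
# (shortened to two assets; this index is illustrative only).
columns = pd.Index(["A_price", "B_price", "A_weight", "B_weight"])

# `in` on an Index asks "is this exact label present?", not
# "does any label contain this text?".
has_exact_price = "price" in columns      # no label is exactly "price"
has_exact_a_price = "A_price" in columns  # this label does exist

print(has_exact_price, has_exact_a_price)  # False True
```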
Instead, I suggest:
any(('price' in column_name) for column_name in df_price_weight.columns)
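Going one step further than the check above, the sketch below is one possible rewrite of the whole function, not the only way to do it. It swaps the if/elif chain for a single rename mapping applied to the second column level, and it assigns the result of `filter`, which returns a new frame rather than modifying in place. The parameter names `df_values` and `df_weights` and the variable names `result` and `closes` are my own.

```python
import pandas as pd


def price_weight(df_values, df_weights):
    """Merge a price-like frame with a weight frame on the index.

    Columns come back as a two-level MultiIndex (asset, field), with the
    price-like field ("price" or "mid") renamed to "close".
    """
    merged = pd.merge(df_values, df_weights, left_index=True, right_index=True)
    # filter() returns a new frame, so the result has to be assigned.
    merged = merged.filter(regex="close|price|mid|weight")
    # "A_price" -> ("A", "price"): level 0 is the asset, level 1 the field.
    merged.columns = merged.columns.str.split("_", expand=True)
    # Rename on level 1 only, so asset names are never touched.
    merged = merged.rename(columns={"price": "close", "mid": "close"}, level=1)
    return merged.sort_index(axis=1)


# Same data as df1 and df3 in the question.
dates = ["01-01-2020", "01-02-2020", "01-03-2020"]
df1 = pd.DataFrame(
    {
        "A_price": [10, 12, 15],
        "B_price": [20, 19, 29],
        "C_price": [23, 21, 4],
        "D_price": [45, 47, 44],
    },
    index=dates,
)
df3 = pd.DataFrame(
    {
        "A_weight": [0.1, 0.2, 0.4],
        "B_weight": [0.2, 0.5, 0.1],
        "C_weight": [0.3, 0.2, 0.1],
        "D_weight": [0.4, 0.1, 0.4],
    },
    index=dates,
)

result = price_weight(df1, df3)

# All close prices, one column per asset, selected from the field level.
closes = result.xs("close", axis=1, level=1)
print(result.columns.tolist())
print(closes)
```

Because the mapping covers both 'price' and 'mid', the same call should work for `price_weight(df2, df3)` without any branching. Note that 'close' lives on the second column level here, so code that looks it up needs to select on that level, as `xs` does above.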
Upvotes: 2