Reputation: 1255
I have two pandas dataframe
import pandas as pd
import numpy as np
import datetime
data = {'group' :["A","A","B","B"],
'val': ["AA","AB","B1","B2"],
'cal1' :[4,5,7,6],
'cal2' :[10,100,100,10]
}
df1 = pd.DataFrame(data)
df1
group val cal1 cal2
0 A AA 4 10
1 A AB 5 100
2 B B1 7 100
3 B B2 6 10
data = {'group' :["A","A","A","B","B","B","B", "B", "B", "B"],
'flag' : [1,0,0,1,0,0,0, 1, 0, 0],
'var1': [1,2,3,7,8,9,10, 15, 20, 30]
}
# Create DataFrame
df2 = pd.DataFrame(data)
df2
group flag var1
0 A 1 1
1 A 0 2
2 A 0 3
3 B 1 7
4 B 0 8
5 B 0 9
6 B 0 10
7 B 1 15
8 B 0 20
9 B 0 30
Step 1: CReate columns in df2(with suffix "_new") based on unique "val" in df1 like below:
unique_val = df1['val'].unique().tolist()
new_cols = [t + '_new' for t in unique_val]
for i in new_cols:
df2[i] = 0
df2
group flag var1 AA_new AB_new B1_new B2_new
0 A 1 1 0 0 0 0
1 A 0 2 0 0 0 0
2 A 0 3 0 0 0 0
3 B 1 7 0 0 0 0
4 B 0 8 0 0 0 0
5 B 0 9 0 0 0 0
6 B 0 10 0 0 0 0
7 B 1 15 0 0 0 0
8 B 0 20 0 0 0 0
9 B 0 30 0 0 0 0
Step 2: for row where flag = 1, AA_new will be calculated as var1(from df2)* value of 'cal1' from df1 for group "A" and val "AA" * value of 'cal2' from df1 for group "A" and val "AA", similarly AB_new will be calculated as var1(from df2) * value of 'cal1' from df1 for group "A" and val "AB" * value of 'cal2' from df1 for group "A" and val "AB"
My expected output should look like below:
group flag var1 AA_new AB_new B1_new B2_new
0 A 1 1 40.0 500.0 0.0 0.0
1 A 0 2 0.0 0.0 0.0 0.0
2 A 0 3 0.0 0.0 0.0 0.0
3 B 1 7 0.0 0.0 4900.0 420.0
4 B 0 8 0.0 0.0 0.0 0.0
5 B 0 9 0.0 0.0 0.0 0.0
6 B 0 10 0.0 0.0 0.0 0.0
7 B 1 15 0.0 0.0 10500.0 900.0
8 B 0 20 0.0 0.0 0.0 0.0
9 B 0 30 0.0 0.0 0.0 0.0
Below solution based on the other stackflow question works partially:
df2.assign(**df1.assign(mul_cal = df1['cal1'].mul(df1['cal2']))
.pivot_table(columns='val',
values='mul_cal',
index = ['group', df2.index])
.add_suffix('_new')
.groupby(level=0)
.apply(lambda x: x.bfill().ffill())
.reset_index(level='group',drop='group')
.fillna(0)
.mul(df2['var1'], axis=0)
.where(df2['flag'].eq(1), 0)
)
Upvotes: 0
Views: 96
Reputation: 682
Flexible Columns
If you want this works when we add several rows more in df1, you can do this.
combinations = df1.groupby(['group','val'])['cal3'].sum().reset_index()
for index_, row_ in combinations.iterrows():
for index, row in df2.iterrows():
if row['flag'] == 1:
if row['group'] == row_['group']:
df2.loc[index, row_['val'] + '_new'] = row['var1'] * df1[(df1['group'] == row_['group']) & (df1['val'] == row_['val'])]['cal3'].values[0]
Hard Code
You can use iteration to dataframe and change its specific column in each iteration, you can do something like this (but you need to add new column into your df1
first).
df1['cal3'] = df1['cal1'] * df1['cal2']
for index, row in df2.iterrows():
if row['flag'] == 1:
if row['group'] == 'A':
df2.loc[index, 'AA_new'] = row['var1'] * df1[(df1['group'] == 'A') & (df1['val'] == 'AA')]['cal3'].values[0]
df2.loc[index, 'AB_new'] = row['var1'] * df1[(df1['group'] == 'A') & (df1['val'] == 'AB')]['cal3'].values[0]
elif row['group'] == 'B':
df2.loc[index, 'B1_new'] = row['var1'] * df1[(df1['group'] == 'B') & (df1['val'] == 'B1')]['cal3'].values[0]
df2.loc[index, 'B2_new'] = row['var1'] * df1[(df1['group'] == 'B') & (df1['val'] == 'B2')]['cal3'].values[0]
This is the result I got.
Upvotes: 1