Reputation: 111
I have two dataframe like this:
DF1
product_name cost Account Acc_no
prod1 200 NA-STG 1264509
prod2 500 NA-STG 1264509
prod3 800 NA-STG 1264509
prod5 300 NA-STG 1264509
prod3 804 NA-DEV 1264345
prod4 511 NA-DEV 1264345
prod5 308 NA-DEV 1264345
df2:
product_name active_subscription distributed_cost account_alias
prod1 20 1299.498484 NA-STG
prod2 50 3248.746211 NA-STG
prod3 80 5197.993937 NA-STG
prod4 80 97.993937 NA-STG
prod4 32 1000 NA-DEV
I want add the distributed_cost from df2 into df1 based account.If any product in df2(ex:prod4 is missing in df1) is missing in df1 then corresponding records has to be added in df1.So the output I want is:
product_name cost Account
prod1 1499.498484 NA-STG
prod2 3748.746211 NA-STG
prod3 5997.993937 NA-STG
prod4 97.993937 NA-STG
prod5 300 NA-STG
prod3 804 NA-DEV
prod4 1511 NA-DEV
prod5 308 NA-DEV
How can I implement this in efficient way?
Upvotes: 2
Views: 75
Reputation: 13821
You can use pd.concat()
with a groupby
to get your result, as below:
pd.concat([
df1.drop('Acc_no',axis=1),
(df2.drop('active_subscription',axis=1)).rename({'distributed_cost':'cost','account_alias':'Account'},axis=1)]) \
.groupby(['product_name','Account'])['cost'].sum() \
.reset_index() \
.sort_values(['Account','product_name'],ascending=[False,True]
)
which prints:
product_name Account cost
0 prod1 NA-STG 1499.498484
1 prod2 NA-STG 3748.746211
3 prod3 NA-STG 5997.993937
5 prod4 NA-STG 97.993937
7 prod5 NA-STG 300.000000
2 prod3 NA-DEV 804.000000
4 prod4 NA-DEV 1511.000000
6 prod5 NA-DEV 308.000000
Upvotes: 1
Reputation: 1704
Do below steps to achieve it:
merge
on your dataframes using product_names and account respective columns in both DFs.Code:
df = df1.merge(df2, right_on=['product_name','account_alias'], left_on=['product_name','Account'], how='outer')
df['Account'] = df['Account'].fillna(df['account_alias'])
df['cost'] = df['cost'].fillna(0) + df['distributed_cost'].fillna(0)
df = df.loc[:, ['product_name','cost', 'Account']]
Output:
>>> df.sort_values(['Account', 'product_name'], ascending=[False,True])
product_name cost Account
0 prod1 1499.498484 NA-STG
1 prod2 3748.746211 NA-STG
2 prod3 5997.993937 NA-STG
7 prod4 97.993937 NA-STG
3 prod5 300.000000 NA-STG
4 prod3 804.000000 NA-DEV
5 prod4 1511.000000 NA-DEV
6 prod5 308.000000 NA-DEV
Upvotes: 1