Reputation: 7644
i have a table in pandas df
id product_1 product_2 count
1 100 200 10
2 200 600 20
3 100 500 30
4 400 100 40
5 500 700 50
6 200 500 60
7 100 400 70
also i have another table in dataframe df2
product price
100 5
200 10
300 15
400 20
500 25
600 30
700 35
i wanted to merge df2
with df1
such that i get price_x and price_y as columns
and then again divide price_y/price_x
to get final column as perc_diff
.
so i tried to do merge using.
# Add prices for products 1 and 2
df3 = (df1.
merge(df2, left_on='product_1', right_on='product').
merge(df2, left_on='product_2', right_on='product'))
# Calculate the percent difference
df3['perc_diff'] = (df3.price_y - df3.price_x) / df3.price_x
But when i did merge i got multiple columns of product_1
and product_2
for eg. my df3.head(1)
after merging is:
id product_1 product_2 count product_1 product_2 price_x price_y
1 100 200 10 100 200 5 10
So how do i remove these multiple column's of product_1
& product_2
while merging or after merging?
Upvotes: 2
Views: 4961
Reputation: 863611
For removing column is necessary rename
:
df3 = df1.merge(df2, left_on='product_1', right_on='product') \
.merge(df2.rename(columns={'product':'product_2'}), on='product_2')
#borrow from piRSquared solution
df3 = df3.assign(perc_diff=df3.price_y.div(df3.price_x).sub(1))
print (df3)
id product_1 product_2 count product price_x price_y perc_diff
0 1 100 200 10 100 5 10 1.00
1 3 100 500 30 100 5 25 4.00
2 6 200 500 60 200 10 25 1.50
3 7 100 400 70 100 5 20 3.00
4 2 200 600 20 200 10 30 2.00
5 4 400 100 40 400 20 5 -0.75
6 5 500 700 50 500 25 35 0.40
Upvotes: 2
Reputation: 294516
df2_ = df2.set_index('product')
df3 = df.join(df2_, on='product_1') \
.join(df2_, on='product_2', lsuffix='_x', rsuffix='_y')
df3.assign(perc_diff=df3.price_y.div(df3.price_x).sub(1))
Upvotes: 4