Reputation: 23
I'm new to pandas. Given two data frames:
df_1
product_id | product_price | invoice_total |
---|---|---|
p1 | 100 | 200 |
p2 | 200 | 300 |
p3 | 300 | 600 |
p4 | 400 | 700 |
df_2
product_id | quantity | invoice_total |
---|---|---|
p1 | 8 | 700 |
p6 | 3 | 900 |
p2 | 5 | 600 |
I want to check if the product id in df1 is similar to that of df2 and if so, pick the value of invoice total in df2.
I've tried the for loop:
df_new = pd.DataFrame()
for i in df1.product_id:
for j in df2.product_id:
if i == j:
# return the value of df2.invoice_total and append to the df_new.
But I believe there's a better way.
The result should be something like this:
product_id | invoice_total |
---|---|
p1 | 700 |
p2 | 600 |
Upvotes: 2
Views: 972
Reputation: 2079
You can use a dataframe merge for this
import pandas as pd
df_1 = pd.DataFrame({
'product_id': ['p1', 'p2', 'p3', 'p4'],
'product_price': [100, 200, 300, 400],
'invoice_total': [200, 300, 600, 700]
})
df_2 = pd.DataFrame({
'product_id': ['p1', 'p6', 'p2'],
'quantity': [8, 3, 5],
'invoice_total': [700, 900, 600]
})
df_merged = df_1.merge(
df_2,
on='product_id',
suffixes=('_df1', '')
)
Contents of df_merged
product_id product_price invoice_total_df1 quantity invoice_total
0 p1 100 200 8 700
1 p2 200 300 5 600
Then filter to only the columns you need
df_merged = df_merged[['product_id', 'invoice_total']]
Final contents of df_merged
product_id invoice_total
0 p1 700
1 p2 600
Upvotes: 1
Reputation: 1
I am assuming you want to do a "left join".
df_new = df1.join(df2, how = 'left')
then in df_new you can select the 'invoice_total' column and make a list of it by:
df_new['invoice_total'].values.tolist()
let me know if this is what you were looking for ;).
Upvotes: 0