Reputation: 7576
I'm trying to merge two pandas DataFrames in a Jupyter notebook. Their dtypes look like this:
gbdf.dtypes:
product_name object
Quantity float64
Product_id int64
product_group1 int64
product_group1_name object
product_group2 int64
product_group2_name object
packing_unit object
packing_amount int64
dtype: object
trns.dtypes:
Store_id int64
Date object
Price int64
Net price int64
Purchase price int64
Hour int64
product_id int64
Quantity int64
dtype: object
Yet, when I try to run
gbdfprice = gbdf.merge(gbdf, trns, left_on = 'Product_id', right_on = 'product_id')
I get
KeyError: 'product_id'
Any idea why?
Upvotes: 4
Views: 10302
Reputation: 1123
The argument list you have used (one that takes both a left and a right DataFrame) belongs to the top-level pandas function, pd.merge. However, you have actually called the merge method of a DataFrame object, which takes only the right DataFrame because the object it is called on is already the left one.
import pandas as pd
left = pd.DataFrame(...)
right = pd.DataFrame(...)
# DataFrame method (the form you called): takes only the right frame
combined = left.merge(right, [options...])
# Top-level function (the form your argument list matches): takes left and right
combined = pd.merge(left, right, [options...])
From what I can see in the source, left.merge(right, ...) just imports the other merge function and runs merge(self, right, ...).
So, as @ayhan points out, the fix is to remove gbdf from the argument list. Alternatively, replace the gbdf.merge call with pd.merge and leave the argument list the same.
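For illustration, here is a minimal sketch of both fixes. The two small frames are hypothetical stand-ins for gbdf and trns, with made-up values and only a few of the columns from the question. In the original call the first positional argument, gbdf, is taken as the right frame, so right_on = 'product_id' is presumably looked up in gbdf, which only has 'Product_id'. That would be consistent with the KeyError.

```python
import pandas as pd

# Hypothetical miniature versions of the two frames (values are made up)
gbdf = pd.DataFrame({
    "product_name": ["apple", "pear"],
    "Product_id": [1, 2],
})
trns = pd.DataFrame({
    "Store_id": [10, 10, 11],
    "product_id": [1, 2, 1],
    "Price": [100, 150, 110],
})

# Fix 1: DataFrame method, so pass only the right frame
via_method = gbdf.merge(trns, left_on="Product_id", right_on="product_id")

# Fix 2: top-level function, so pass both left and right frames
via_function = pd.merge(gbdf, trns, left_on="Product_id", right_on="product_id")

print(via_method)
```

Both calls should give the same result. Because the key columns have different names, the merged frame keeps both Product_id and product_id.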
Upvotes: 1