Reputation: 177
I have two data frames. I would like to use groupby on the second data frame and then merge the two on the Company Name column. The issue is that after the groupby statement I lose the Company Name column.
import pandas as pd
df1 = pd.DataFrame(
    {
        'Company Name': ['Google','Google','Microsoft','Microsoft','Amazon','Amazon'],
        'Location': ['Somewhere','Somewhere','Somewhere','Somewhere','Somewhere','Somewhere'],
    }
)
df = pd.DataFrame(
    {
        'Company Name': ['Google','Google','Microsoft','Microsoft','Amazon','Amazon'],
        'Sales': [12345,12345,12345,12345,12345,12345],
        'Company Type': ['Software','Software','Software','Software','Software','Software']
    }
)
df = df.groupby(['Company Name']).sum()
pd.merge(df1,df,how="inner",on="Company Name")
I get an error message when merging because df no longer has a Company Name column to join on.
Upvotes: 1
Views: 758
Reputation: 71580
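By default, groupby moves the grouping column into the index, so df no longer has Company Name as a regular column. Replace this line: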
df = df.groupby(['Company Name']).sum()
With:
df = df.groupby('Company Name', as_index=False).sum()
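Then your code will work as expected and return the following (on recent pandas versions, sum() also concatenates the string column, so you get an extra Company Type column unless you select Sales before calling sum()):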
Company Name Location Sales
0 Google Somewhere 24690
1 Google Somewhere 24690
2 Microsoft Somewhere 24690
3 Microsoft Somewhere 24690
4 Amazon Somewhere 24690
5 Amazon Somewhere 24690
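For reference, here is a minimal sketch that reruns the question's data end to end. It assumes you only want Sales summed, so it selects that column explicitly (the `[["Sales"]]` selection and the variable names `by_index`, `totals`, `totals_alt`, and `merged` are my additions, not part of the original code). It also shows `reset_index()` as an alternative that should give the same frame.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Company Name": ["Google", "Google", "Microsoft", "Microsoft", "Amazon", "Amazon"],
    "Location": ["Somewhere"] * 6,
})
df = pd.DataFrame({
    "Company Name": ["Google", "Google", "Microsoft", "Microsoft", "Amazon", "Amazon"],
    "Sales": [12345] * 6,
    "Company Type": ["Software"] * 6,
})

# Default groupby: the grouping key becomes the index, not a column.
by_index = df.groupby("Company Name")[["Sales"]].sum()

# as_index=False keeps "Company Name" as a regular column.
totals = df.groupby("Company Name", as_index=False)[["Sales"]].sum()

# Alternative: move the index back into a column after grouping.
totals_alt = by_index.reset_index()

# Inner join on the shared key; the row order follows df1.
merged = pd.merge(df1, totals, how="inner", on="Company Name")
print(merged)
```

Selecting Sales before `sum()` keeps the result independent of how a given pandas version treats the string Company Type column.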
Upvotes: 2