tink3r

Reputation: 177

How do I retain the column used in my groupby with pandas?

I have two data frames. I would like to use groupby on the second data frame and then merge the two together on the Company Name column. The issue is that after my groupby statement I lose the Company Name column.

import pandas as pd

df1 = pd.DataFrame(
    {
        'Company Name': ['Google','Google','Microsoft','Microsoft','Amazon','Amazon'],
        'Location': ['Somewhere','Somewhere','Somewhere','Somewhere','Somewhere','Somewhere'],
    }
)

df = pd.DataFrame(
    {
        'Company Name': ['Google','Google','Microsoft','Microsoft','Amazon','Amazon'],
        'Sales': [12345,12345,12345,12345,12345,12345],
        'Company Type': ['Software','Software','Software','Software','Software','Software']
    }
)
df = df.groupby(['Company Name']).sum()

pd.merge(df1,df,how="inner",on="Company Name")

I get an error message when merging because df no longer has a Company Name column to join on.

Upvotes: 1

Views: 758

Answers (1)

U13-Forward

Reputation: 71580

Replace this line:

df = df.groupby(['Company Name']).sum()

With:

df = df.groupby('Company Name', as_index=False).sum()

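By default, groupby moves the grouping key into the index of the result; as_index=False keeps Company Name as a regular column, so the merge has something to join on. Your code will then work as expected and return the following (depending on your pandas version, the non-numeric Company Type column may also appear, with its strings concatenated, unless you select only Sales or pass numeric_only=True to sum):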

  Company Name   Location  Sales
0       Google  Somewhere  24690
1       Google  Somewhere  24690
2    Microsoft  Somewhere  24690
3    Microsoft  Somewhere  24690
4       Amazon  Somewhere  24690
5       Amazon  Somewhere  24690
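
For reference, here is a minimal end-to-end sketch of the fix using the data from the question. The names totals, totals_alt and merged are mine, not from the original code, and selecting ['Sales'] before summing is an extra step I am assuming you want so that only the numeric column is aggregated. It also shows reset_index() as an alternative that should give the same frame.

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        'Company Name': ['Google', 'Google', 'Microsoft', 'Microsoft', 'Amazon', 'Amazon'],
        'Location': ['Somewhere'] * 6,
    }
)

df = pd.DataFrame(
    {
        'Company Name': ['Google', 'Google', 'Microsoft', 'Microsoft', 'Amazon', 'Amazon'],
        'Sales': [12345] * 6,
        'Company Type': ['Software'] * 6,
    }
)

# as_index=False keeps 'Company Name' as a regular column instead of the index.
# Selecting ['Sales'] (an assumption about intent) limits the sum to the numeric column.
totals = df.groupby('Company Name', as_index=False)['Sales'].sum()

# Alternative: group as usual, then move the index back into a column.
totals_alt = df.groupby('Company Name')['Sales'].sum().reset_index()

# Both frames now have a 'Company Name' column, so the merge can join on it.
merged = pd.merge(df1, totals, how='inner', on='Company Name')
print(merged)
```

Either form works; as_index=False is shorter, while reset_index() is handy when the grouped result already exists and you only need the key back as a column.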

Upvotes: 2
