Reputation: 359
Is there an easy way to do the following: I have a panda dataframe with the following columns: date (by day), company, sales. I want to create a dataframe with the sales for each company as a column; i.e., the new columns should be: date, company 1 sales, company 2 sales, company 3 sales. I'd like to name the columns by the name in the company column.
I did: df.groupby(['company'])
and thought I could an outer join for each element in the group. But this seems unwieldly? It seems like there should be an easier way to do this? Thank you! Any help is greatly appreciated.
An example is:
date | company | sales
1/1/2017 | AAA | 100
1/1/2017 | BBB | 200
1/1/2017 | CCC | 300
And I want to convert it to:
date | AAA | BBB | CCC
1/1/2017 | 100 | 200 | 300
Upvotes: 1
Views: 108
Reputation: 294328
option 1
set_index
+ unstack
df.set_index(['date', 'company']).sales.unstack().reset_index()
company date AAA BBB CCC
0 1/1/2017 100 200 300
option 2
pivot
df.pivot('date', 'company', 'sales').reset_index()
company date AAA BBB CCC
0 1/1/2017 100 200 300
Upvotes: 1