Reputation: 340
I have following dataframe
Categories Owner
Product_Query products_team
Product_Query products_team
Infrastructure_Query infra_team
Bug_Query development_team
Bug_Query development_team
I want to display the above data in the following format
Categories Total products_team infra_team development_team
Product_Query 2 2 0 0
Infrastructure_Query 1 0 1 0
Bug_Query 2 0 0 2
I am doing the following
df["count"]=1
df_pivot=df.pivot_table(index=["Catagories"],columns=["Owner"],values=["count"],aggfunc=[np.sum],fill_value=0)
But I am not getting the output in the correct format.How can I get the output in the format mentioned above?
Upvotes: 1
Views: 510
Reputation: 153460
pd.crosstab
:pd.crosstab(df['Categories'], df['Owner'], margins=True, margins_name='Total')
Output:
Owner development_team infra_team products_team Total
Categories
Bug_Query 2 0 0 2
Infrastructure_Query 0 1 0 1
Product_Query 0 0 2 2
Total 2 1 2 5
And, if you don't want the row Total you can drop it:
df_out = pd.crosstab(df['Categories'], df['Owner'], margins=True, margins_name='Total')
df_out = df_out.drop('Total')
df_out
Output:
Owner development_team infra_team products_team Total
Categories
Bug_Query 2 0 0 2
Infrastructure_Query 0 1 0 1
Product_Query 0 0 2 2
pivot_table
:df.assign(counts=1)\
.pivot_table('counts','Categories','Owner','sum',
fill_value=0, margins=True, margins_name='Total')\
.drop('Total')
Output:
Owner development_team infra_team products_team Total
Categories
Bug_Query 2 0 0 2
Infrastructure_Query 0 1 0 1
Product_Query 0 0 2 2
groupby
:df_out = df.groupby(['Categories', 'Owner'])['Owner'].count().unstack(fill_value=0)
df_out['Total'] = df_out.sum(1)
Output:
Owner development_team infra_team products_team Total
Categories
Bug_Query 2 0 0 2
Infrastructure_Query 0 1 0 1
Product_Query 0 0 2 2
Upvotes: 3