Reputation: 10041
I have grouped a dataframe by state and taken the value counts of industry:
df.loc[df['state'].isin(['Alabama','Arizona'])].groupby(df['state'])['industry'].value_counts(sort = True)
Out:
state    industry
Alabama  Financial Services                      224
         Education                                 7
         Healthcare, Pharmaceuticals, & Biotech    5
         Business Services                         2
         Other                                     2
         Retail                                    2
         Government                                1
         Manufacturing                             1
         Transportation & Storage                  1
Arizona  Healthcare, Pharmaceuticals, & Biotech   19
         Other                                    13
         Education                                 5
         Retail                                    5
         Transportation & Storage                  5
         Manufacturing                             4
         Travel, Recreation, and Leisure           4
         Consumer Services                         3
         Energy & Utilities                        2
         Financial Services                        2
         Government                                2
         Business Services                         1
         Computers & Electronics                   1
         Software & Internet                       1
Name: industry, dtype: int64
Now I would like to go further and get each value count as a percentage of its state's total. For example, for Alabama I want to know the percentage of Financial Services, which is calculated as 224 / (224 + 7 + ... + 1), and so on for the other industries.
How could I do that, either with new code or by modifying the code above? Thanks.
Upvotes: 1
Views: 658
Reputation: 323306
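Add normalize=True to value_counts. It then returns each industry's share within its state (fractions that sum to 1 per state) instead of raw counts; multiply by 100 if you want percentages: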
df.loc[df['state'].isin(['Alabama','Arizona'])].groupby(df['state'])['industry'].value_counts(sort = True, normalize=True)
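To see the effect on something small, here is a minimal sketch on made-up data (the rows below are hypothetical, not the asker's dataset). It groups by the column name "state" rather than by df['state'], which should give the same grouping here, and it also computes the asker's formula by hand (count divided by the state's total) so the two results can be compared.

```python
import pandas as pd

# Hypothetical sample data, only for illustration
df = pd.DataFrame({
    "state": ["Alabama"] * 4 + ["Arizona"] * 4 + ["Texas"] * 2,
    "industry": [
        "Financial Services", "Financial Services", "Financial Services", "Education",
        "Retail", "Retail", "Other", "Education",
        "Retail", "Other",
    ],
})

# Keep only the two states of interest
subset = df.loc[df["state"].isin(["Alabama", "Arizona"])]

# normalize=True divides each count by the total of its own group (state)
pct = (
    subset.groupby("state")["industry"]
    .value_counts(sort=True, normalize=True)
    .mul(100)   # fractions -> percentages
    .round(2)
)

# Manual equivalent of the formula in the question: count / sum of counts per state
counts = subset.groupby("state")["industry"].value_counts(sort=True)
manual = (counts / counts.groupby(level="state").transform("sum") * 100).round(2)

print(pct)
```

With this sample, Alabama has 3 Financial Services rows out of 4, so its share comes out as 75.0, and the percentages within each state add up to 100.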
Upvotes: 2