Jiten

Reputation: 153

Stacked bar using group by in Python dataframe

I am trying to create a stacked bar graph that replicates the image below. I have read my data from a CSV file and am trying to group it and plot a stacked bar chart, but I am not getting the desired output.

I grouped the data like this:

modified_df1 = modified_df.groupby(["business_postal_code","risk_category"]).size().reset_index(name='counts')
modified_df1 = modified_df1.loc[modified_df1['counts'] > 1100]

After the group by and filter, the data looks like this:

    business_postal_code    risk_category   counts
20  94102.0                 Low Risk        1334
22  94102.0                 UnKnown         1106
24  94103.0                 Low Risk        1472
25  94103.0                 Moderate Risk   1474
26  94103.0                 UnKnown         1329
44  94109.0                 Low Risk        1415
48  94110.0                 Low Risk        2189
49  94110.0                 Moderate Risk   1731
50  94110.0                 UnKnown         1331
117 94133.0                 Low Risk        1412

Then I plotted the stacked bar:

df2 = modified_df1.groupby(['business_postal_code','risk_category'])['business_postal_code'].count().unstack('risk_category')
df2[['Moderate Risk','Low Risk']].plot(kind='bar', stacked=True)

Current output

Desired output

Please suggest how to achieve the desired output. The issue is that I have to group the data by two columns, then apply a filter (counts > 1100), and then plot the stacked bar.

Upvotes: 1

Views: 5485

Answers (2)

Jiten

Reputation: 153

Using sum() instead of count() with the group by also gives the expected output.

df2 = modified_df1.groupby(['business_postal_code','risk_category'])['counts'].sum().unstack('risk_category')

df2[['Moderate Risk','Low Risk','High Risk','SAFE']].plot(kind='bar', stacked=True, figsize=(12,8))

However, the approach suggested by Nk03 also works and is cleaner.
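
To see why count() flattens the bars: after the first group by, each (postal code, risk category) pair occupies exactly one row, so counting rows again yields 1 for every pair, while summing the counts column keeps the original totals. A minimal sketch using three rows from the question's table (the names by_count and by_sum are illustrative, not from the original post):

```python
import pandas as pd

# Three rows taken from the already-aggregated table in the question.
modified_df1 = pd.DataFrame({
    "business_postal_code": [94102.0, 94103.0, 94103.0],
    "risk_category": ["Low Risk", "Low Risk", "Moderate Risk"],
    "counts": [1334, 1472, 1474],
})

keys = ["business_postal_code", "risk_category"]

# count() counts rows per group; each pair appears once, so every value is 1.
by_count = (
    modified_df1.groupby(keys)["business_postal_code"]
    .count()
    .unstack("risk_category")
)

# sum() adds up the stored counts, which is what the bar heights should show.
by_sum = modified_df1.groupby(keys)["counts"].sum().unstack("risk_category")

print(by_count)
print(by_sum)
```

Pairs that do not occur in the data (here 94102.0 with Moderate Risk) come out as NaN in both tables.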

Upvotes: 0

Nk03

Reputation: 14949

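IIUC, you can try the following (note that unpacking the column names positionally with *df only works on pandas versions before 2.0, where the arguments of pivot became keyword-only):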

df.pivot(*df).plot(kind = 'bar', stacked = True)

OR:

df.pivot_table(index = 'business_postal_code', columns = 'risk_category' , values = 'counts').plot(kind = 'bar', stacked = True)
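
One detail worth knowing about the pivot_table variant: its default aggregation is the mean, which happens to equal the value itself when every (postal code, risk category) pair is unique, as it is here. The sketch below makes the aggregation explicit and fills missing combinations with 0; the aggfunc and fill_value choices are my assumptions about what is wanted, not part of the original answer.

```python
import pandas as pd

# A few rows from the question's table.
df = pd.DataFrame({
    "business_postal_code": [94102.0, 94102.0, 94109.0],
    "risk_category": ["Low Risk", "UnKnown", "Low Risk"],
    "counts": [1334, 1106, 1415],
})

wide = df.pivot_table(
    index="business_postal_code",
    columns="risk_category",
    values="counts",
    aggfunc="sum",   # default is the mean; sum states the intent explicitly
    fill_value=0,    # missing combinations become 0 instead of NaN
)
print(wide)
```

The resulting wide table has one row per postal code and one column per risk category, which is the shape that plot(kind='bar', stacked=True) expects.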

OUTPUT:

(image: stacked bar chart)

Complete Example:

import pandas as pd

df = pd.DataFrame({'business_postal_code': {20: 94102.0,
  22: 94102.0,
  24: 94103.0,
  25: 94103.0,
  26: 94103.0,
  44: 94109.0,
  48: 94110.0,
  49: 94110.0,
  50: 94110.0,
  117: 94133.0},
 'risk_category': {20: 'Low Risk',
  22: 'UnKnown',
  24: 'Low Risk',
  25: 'Moderate Risk',
  26: 'UnKnown',
  44: 'Low Risk',
  48: 'Low Risk',
  49: 'Moderate Risk',
  50: 'UnKnown',
  117: 'Low Risk'},
 'counts': {20: 1334,
  22: 1106,
  24: 1472,
  25: 1474,
  26: 1329,
  44: 1415,
  48: 2189,
  49: 1731,
  50: 1331,
  117: 1412}})
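# Keyword arguments work on all pandas versions; df.pivot(*df) fails on pandas 2.0+.
df.pivot(index='business_postal_code', columns='risk_category', values='counts').plot(kind='bar', stacked=True)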
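
For completeness, the whole pipeline from the question (group by two columns, filter on the count, then plot) can be sketched end to end. The raw rows and the MIN_COUNT threshold below are hypothetical stand-ins for the asker's CSV and the 1100 cutoff, and the Agg backend is only chosen so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption for headless runs
import pandas as pd

# Hypothetical raw data: one row per inspection.
raw = pd.DataFrame({
    "business_postal_code": [94102.0] * 5 + [94103.0] * 4 + [94109.0],
    "risk_category": (
        ["Low Risk"] * 3 + ["Moderate Risk"] * 2    # 94102
        + ["Low Risk"] * 2 + ["Moderate Risk"] * 2  # 94103
        + ["Low Risk"]                              # 94109
    ),
})

MIN_COUNT = 1  # stand-in for the question's threshold of 1100

# Step 1: count rows per (postal code, risk category) pair.
counts = (
    raw.groupby(["business_postal_code", "risk_category"])
    .size()
    .reset_index(name="counts")
)

# Step 2: keep only pairs whose count exceeds the threshold.
filtered = counts.loc[counts["counts"] > MIN_COUNT]

# Step 3: reshape to one row per postal code, one column per category.
wide = filtered.pivot(
    index="business_postal_code", columns="risk_category", values="counts"
)

# Step 4: draw the stacked bars and save the figure to a file.
ax = wide.plot(kind="bar", stacked=True, figsize=(8, 5))
ax.set_ylabel("counts")
ax.figure.savefig("stacked_bar.png")
```

Filtering before the pivot means a postal code disappears from the chart only when none of its categories pass the threshold; categories that were filtered out for a remaining postal code show up as NaN and are simply not drawn.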

Upvotes: 2
