dublin123

Reputation: 43

How to build a population pyramid with pandas dataframe

How to plot a population pyramid based on the following starting dataframe?

           Age  Gender  Count
0  50-45 years    male      4
1  50-45 years  female      5
2  55-65 years    male      6
3  55-65 years  female      7
4  65-70 years    male     11
5  65-70 years  female     12

I tried the approach from "Population Pyramid with Python and Seaborn", but the resulting plot looks strange:

import pandas as pd
import seaborn as sns

# data
data = {'Age': ['50-45 years', '50-45 years', '55-65 years', '55-65 years', '65-70 years', '65-70 years'],
        'Gender': ['male', 'female', 'male', 'female', 'male', 'female'], 'Count': [4, 5, 6, 7, 11, 12]}

df = pd.DataFrame(data)

# plot
sns.barplot(data=df, x='Count', y='Age',
            hue='Gender', orient='horizontal', 
            dodge=False)

I think the problem is that my age is a string.


Upvotes: 1

Views: 1321

Answers (1)

Trenton McKinney

Reputation: 62403

  • Unlike in the linked question, 'Count' is positive for both 'Gender' groups, so with dodge=False the 'female' bars are drawn on top of the 'male' bars.
  • Convert one of the groups to negative values, using .loc and Boolean selection.
# convert male counts to negative
df.loc[df.Gender.eq('male'), 'Count'] = df.Count.mul(-1)

# plot
sns.barplot(data=df, x='Count', y='Age', hue='Gender', orient='horizontal', dodge=False)
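
With negative counts, the x-axis ticks on the male side will show negative numbers. One possible way to tidy this up, sketched below, is to keep the original 'Count' column untouched, plot a separate signed column, and format the ticks as absolute values with matplotlib's FuncFormatter. The column name 'Signed', the variable plot_df, and the symmetric axis limits are illustrative choices, not part of the original question or answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import pandas as pd
import seaborn as sns
from matplotlib.ticker import FuncFormatter

data = {'Age': ['50-45 years', '50-45 years', '55-65 years', '55-65 years', '65-70 years', '65-70 years'],
        'Gender': ['male', 'female', 'male', 'female', 'male', 'female'],
        'Count': [4, 5, 6, 7, 11, 12]}
df = pd.DataFrame(data)

# 'Signed' is a hypothetical helper column: male counts negated, female counts kept
plot_df = df.assign(Signed=df['Count'].where(df['Gender'].ne('male'), -df['Count']))

ax = sns.barplot(data=plot_df, x='Signed', y='Age', hue='Gender', orient='h', dodge=False)

# center the pyramid by using the same limit on both sides of zero
limit = plot_df['Signed'].abs().max()
ax.set_xlim(-limit, limit)

# show tick labels as absolute values so the male side does not read as negative
ax.xaxis.set_major_formatter(FuncFormatter(lambda value, pos: f"{abs(value):g}"))
ax.set_xlabel('Count')
```

Because the signed values live in their own column, df['Count'] stays positive and can still be used for totals or labels.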


Upvotes: 3
