user1008537

Plotting a stacked histogram with Pandas with Group By

I am working with a dataset that looks as follows:

Gender, Height, Width
Male, 23.4, 4.4
Female, 45.4, 4.5

I'd like to visualize the stacked histograms of height and width. I'm hoping to have two stacked histograms per plot (one for each gender).

This is the stacked histogram from the documentation. If there were three genders, this might be a good graph for width.

[Image: stacked histogram from the pandas documentation]

I hope you understand what I mean. I've been banging my head against this for hours.

Upvotes: 5

Views: 8961

Answers (1)

user2285236

The example from the pandas documentation has three separate columns in a DataFrame, and df.hist() generates three different histograms for those three columns. Your data is structured differently: it is in long format, with the gender stored as a value in a single column rather than as a column of its own. If you'd like to use matplotlib directly, you can try this:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Build a reproducible sample dataset with the same columns as in the question.
np.random.seed(10)
df = pd.DataFrame({"Gender": np.random.choice(["Female", "Male"], 1000),
                   "Height": 30 + np.random.randn(1000) * 5,
                   "Width": 5 + np.random.randn(1000)})
# Shift the male heights so the two distributions are visibly different.
df.loc[df["Gender"] == "Male", "Height"] += 8

# Draw one semi-transparent histogram per gender on the same axes.
plt.hist(df.loc[df["Gender"] == "Male", "Height"], alpha=0.6, label="Male")
plt.hist(df.loc[df["Gender"] == "Female", "Height"], alpha=0.6, label="Female")
plt.legend()
plt.show()

This draws the two groups as overlaid, semi-transparent histograms on the same axes, like this:

[Image: overlaid histograms of Height for Male and Female]
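
Note that the two plt.hist calls above overlay the histograms rather than stack them. If you want the bars actually stacked, one option is to pass a list of arrays to a single hist call with stacked=True, which also makes both groups share the same bins. Below is a minimal sketch of that idea, assuming the same kind of long-format data as in the question. The helper name plot_stacked_by_group, its parameters, and the random sample data are my own for illustration and are not part of pandas or matplotlib.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Sample data (an assumption for illustration): same columns as the question.
rng = np.random.default_rng(10)
df = pd.DataFrame({
    "Gender": rng.choice(["Female", "Male"], 1000),
    "Height": 30 + rng.standard_normal(1000) * 5,
    "Width": 5 + rng.standard_normal(1000),
})


def plot_stacked_by_group(df, columns, by="Gender", bins=20):
    """Hypothetical helper: one stacked histogram per column, split by `by`."""
    fig, axes = plt.subplots(1, len(columns),
                             figsize=(5 * len(columns), 4), squeeze=False)
    for ax, column in zip(axes[0], columns):
        grouped = df.groupby(by)[column]
        # One label and one array of values per group (groups come out sorted).
        labels = [name for name, _ in grouped]
        values = [series.to_numpy() for _, series in grouped]
        # A list of arrays plus stacked=True stacks the groups on shared bins.
        ax.hist(values, bins=bins, stacked=True, label=labels)
        ax.set_xlabel(column)
        ax.set_ylabel("Count")
        ax.legend(title=by)
    fig.tight_layout()
    return fig, axes[0]


fig, axes = plot_stacked_by_group(df, ["Height", "Width"])
```

Call plt.show() afterwards (or fig.savefig with a file name of your choice) to see the result. Because the grouping goes through groupby, the same helper should also work if there are more than two categories in the grouping column.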

Upvotes: 8
