Reputation: 93
I have a dataset with an obvious stratification, and I'm looking for graphical evidence that the histograms of the strata differ. Suppose my dataset looks something like this:
id | cat | hour
---------------
1 | a | 14
5 | c | 9
If I try to plot a histogram for each value of the categorical variable, the graphs overlap. For example, if I write
unique_cats = list(df["cat"].unique())
for cat in unique_cats:
    df[df["cat"] == cat]["hour"].hist(bins=24, rwidth=0.9,
                                      density=True, alpha=0.3)
then I get a bunch of overlapping histograms, all drawn on the same axes.
How can I get each histogram on its own separate line in my Jupyter notebook?
Upvotes: 5
Views: 2956
Reputation: 339660
You might want to create a new figure (plt.figure()) for each category:
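import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
%matplotlib inline

# Random sample data standing in for the real dataset
df = pd.DataFrame({"cat": np.random.choice(list("ABC"), size=100),
                   "hour": np.random.rand(100)})

unique_cats = list(df["cat"].unique())
for cat in unique_cats:
    plt.figure()  # start a new figure so each histogram is drawn separately
    # density=True replaces normed=True, which newer matplotlib versions no longer accept
    df[df["cat"] == cat]["hour"].hist(bins=24, rwidth=0.9,
                                      density=True, alpha=0.3)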
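If you would rather keep everything in a single figure, one possible alternative is to create one row of axes per category with plt.subplots and draw each histogram on its own axes. The following is only a minimal sketch: the seeded random data, the figsize, and the per-axes titles are assumptions for illustration, not part of the original question.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data with the same columns as in the question.
rng = np.random.default_rng(0)
df = pd.DataFrame({"cat": rng.choice(list("ABC"), size=100),
                   "hour": rng.random(100)})

unique_cats = sorted(df["cat"].unique())

# One row of axes per category, stacked vertically in a single figure.
# squeeze=False keeps `axes` two-dimensional even with a single category.
fig, axes = plt.subplots(nrows=len(unique_cats), ncols=1,
                         sharex=True, squeeze=False,
                         figsize=(6, 2.5 * len(unique_cats)))

for ax, cat in zip(axes[:, 0], unique_cats):
    # Draw this category's histogram on its own axes.
    ax.hist(df.loc[df["cat"] == cat, "hour"], bins=24, rwidth=0.9,
            density=True, alpha=0.3)
    ax.set_title(f"cat = {cat}")

fig.tight_layout()
```

With sharex=True the rows share the same x-axis range, which can make differences between the categories easier to compare by eye. In a notebook with %matplotlib inline the figure is shown when the cell finishes; in a plain script you would call plt.show() at the end.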
Upvotes: 6