user184074

Reputation: 93

How can I display histograms of multiple pandas Series in the same Jupyter notebook cell?

I have a dataset with an obvious stratification, and I'm looking for graphical evidence that the histograms of the strata differ. Suppose my data set looks something like

id | cat | hour
---------------
1  | a   | 14
5  | c   | 9

If I try to plot a histogram for each value of the categorical variable, the graphs overlap. For example, if I write

unique_cats = list(df["cat"].unique())
for cat in unique_cats:
    # density=True replaces normed=True, which was removed in Matplotlib 3.1
    df[df["cat"] == cat]["hour"].hist(bins=24, rwidth=0.9,
                                      density=True, alpha=0.3)

then I get a bunch of overlapping histograms, as in this screenshot:

[image: overlapping histograms]

How can I get each histogram drawn on its own separate line in my Jupyter notebook?

Upvotes: 5

Views: 2956

Answers (1)

ImportanceOfBeingErnest

Reputation: 339660

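Each call to .hist() draws on the current axes, which is why the histograms end up on top of each other. You might want to create a new figure (plt.figure()) for each category: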

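import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
%matplotlib inline

# Random example data standing in for the real dataset
df = pd.DataFrame({"cat": np.random.choice(list("ABC"), size=100),
                   "hour": np.random.rand(100)})
unique_cats = list(df["cat"].unique())
for cat in unique_cats:
    # A fresh figure per category, so each histogram gets its own output
    plt.figure()
    # density=True replaces normed=True, which was removed in Matplotlib 3.1
    df[df["cat"] == cat]["hour"].hist(bins=24, rwidth=0.9,
                                      density=True, alpha=0.3)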

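If you would rather keep everything in a single figure, another option is to create one axes per category with plt.subplots and pass each one to .hist() through its ax argument. The following is only a minimal sketch: the data is synthetic (integer hours from 0 to 23 drawn with a fixed seed are my assumption, not the asker's real dataset), and the figure size and titles are arbitrary choices.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the question's data (an assumption, not the real dataset)
rng = np.random.default_rng(0)
df = pd.DataFrame({"cat": rng.choice(list("ABC"), size=100),
                   "hour": rng.integers(0, 24, size=100)})

unique_cats = sorted(df["cat"].unique())

# One row of axes per category, stacked vertically in a single figure
fig, axes = plt.subplots(nrows=len(unique_cats), ncols=1, sharex=True,
                         figsize=(6, 2.5 * len(unique_cats)), squeeze=False)

for ax, cat in zip(axes[:, 0], unique_cats):
    # ax=ax sends this histogram to its own axes instead of the current one
    df.loc[df["cat"] == cat, "hour"].hist(bins=24, range=(0, 24), rwidth=0.9,
                                          density=True, alpha=0.3, ax=ax)
    ax.set_title(f"cat = {cat}")

fig.tight_layout()
```

With sharex=True the hour axis is aligned across the rows, which can make it easier to compare the categories by eye. In a notebook with the inline backend, the figure is rendered when the cell finishes.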

Upvotes: 6
