MogaGennis

Reputation: 107

Plot overlapping histograms in seaborn

I want to plot two histplots that overlap nicely in seaborn. This is my code:

sns.histplot(data=data[feature_1], color='r')
sns.histplot(data=data[data.feature_2 == 1][feature_1], color='g')

It gives the result I want for 2 possible values: [image]

But for multiple values it doesn't fit nicely: [image]

Any thoughts or advice on how to make this look nice?

Upvotes: 0

Views: 980

Answers (1)

mwaskom

Reputation: 49032

The problem in your second plot is that the histogram defines some number of bins (probably about 10; it depends on the size and variance of the dataset) between the minimum and maximum values in the data. That is not ideal when you have a small number of integer values. If you add discrete=True, it will define unit-width bins centered on every integer from the minimum to the maximum value (inclusive), producing an effect similar to what you get for a histogram over categorical data.

With categorical data it is easy to infer that you want bins to correspond to unique data values. It is more difficult when data are numeric. You might think it makes sense if the dtype is integer, but obvious counterexamples come to mind. Imagine a histogram of ages in a population, or even worse, salaries. Unit-width bins would be too small in those cases. So discrete binning is strictly opt-in for numeric data.
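To put rough numbers on those counterexamples, here is a small back-of-the-envelope sketch. The ranges are made-up illustrations, not values from the question, and unit_bin_count is a hypothetical helper, not a seaborn function.

```python
def unit_bin_count(lo, hi):
    """Number of width-1 bins centered on every integer from lo to hi, inclusive."""
    return hi - lo + 1

# Assumed, illustrative ranges:
small_codes = unit_bin_count(0, 7)            # a feature with a handful of integer values
ages = unit_bin_count(0, 100)                 # ages in years
salaries = unit_bin_count(20_000, 200_000)    # salaries in whole currency units

print(small_codes, ages, salaries)
```

A few unit-width bars read fine, but one bar per year of age is already noisy, and one bar per currency unit of salary is useless, which is why an integer dtype alone is not treated as a signal to bin discretely.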

Upvotes: 2
