Chris

Reputation: 31

What is the height of the histograms in a pairplot in seaborn?

I have a question about the y-axis of the histograms that a default seaborn pairplot generates.

Here is some example code:

import pandas as pd
import seaborn as sns
import numpy as np

data = [np.random.random_sample(20), np.random.random_sample(20)]
dataFrame = pd.DataFrame(data=zip(*data))
g = sns.pairplot(dataFrame)
g.savefig("test.png", dpi=100)

What is the unit of the y-axis for the histograms on the diagonal? How can I read the height of a bin in this view?

Thank you very much,
Chris

Upvotes: 3

Views: 2252

Answers (1)

Diziet Asahi

Reputation: 40737

By default, pairplot uses the diagonal to "show the univariate distribution of the data for the variable in that column" (http://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.pairplot.html).

So each bar represents the count of values in the corresponding bin (whose range you can read from the X axis). The Y axis tick labels, however, do not show that count: they are shared with the scatterplots in the same row, so they show the variable's value range instead.

I could not get the data from the PairPlot itself, but unless you specify otherwise, seaborn uses plt.hist() to generate that diagonal, so you can get the counts using:

import matplotlib.pyplot as plt
# IPython/Jupyter magic; drop this line in a plain script
%matplotlib inline
import pandas as pd
import seaborn as sns
import numpy as np

data = [np.random.random_sample(20), np.random.random_sample(20)]
dataFrame = pd.DataFrame(data=zip(*data))
g = sns.pairplot(dataFrame)


# for the first variable:
# c = counts per bin, b = bin edges, p = the drawn bar patches
c, b, p = plt.hist(dataFrame.iloc[:,0])
print(c)
# e.g. [ 3.  6.  0.  2.  3.  0.  1.  3.  1.  1.] (varies with the random data)

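If you only need the numbers and not the plot, here is a minimal sketch that computes the same kind of counts with np.histogram and pairs each one with the interval it covers on the X axis. The names `rng`, `values`, `counts` and `edges` are my own, and the seeded random sample stands in for one column of the DataFrame. It assumes 10 equal-width bins, which is the plt.hist() default. Depending on your seaborn version, the diagonal may be drawn with a different number of bins, so treat this as an illustration, not an exact readout of the plotted bars.

```python
import numpy as np

# Hypothetical stand-in for one DataFrame column; seeded so reruns match.
rng = np.random.default_rng(0)
values = rng.random(20)

# 10 equal-width bins between min and max (assumed; the plt.hist default).
counts, edges = np.histogram(values, bins=10)

# Each count belongs to the interval between two neighbouring edges.
# Bins are half-open [left, right), except the last one, which also
# includes its right edge.
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.3f} to {right:.3f}: {count}")
```

The counts always add up to the number of rows (20 here), which is a quick way to check that you are looking at raw counts and not a normalized density.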

Upvotes: 6
