Harvs

Reputation: 533

Quantile-Quantile Plot using Seaborn and SciPy

Can anyone show me how to make a Q-Q plot in Seaborn to test data for normality? Failing that, a matplotlib solution would do.

Thanks in advance

Upvotes: 10

Views: 38327

Answers (4)

Uwe Schweinsberg

Reputation: 48

The documentation of the seaborn-qqplot add-on shows an example.

Working with PyCharm on Windows 10, I had difficulty installing the library with:

pip install seaborn-qqplot

in my virtual environment. The import line:

from seaborn_qqplot import pplot

was not recognized.

Installing the package through PyCharm instead (File -> Settings -> Project -> Python Interpreter -> + (Install)) solved it: I could then import pplot from seaborn_qqplot and create a quantile-quantile plot.

Upvotes: 0

leonkato

Reputation: 196

Try statsmodels.api.qqplot().

Using the same data as ImportanceOfBeingErnest's answer, this example plots a normally distributed sample against a normal distribution, resulting in a fairly straight line:

import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

a = np.random.normal(5, 5, 250)
sm.qqplot(a)
plt.show()

[Image: Q-Q plot of the normal sample]

This example plots a Rayleigh-distributed sample against a normal distribution, resulting in a slightly curved line rather than a straight one:

a = np.random.rayleigh(5, 250)
sm.qqplot(a)
plt.show()

[Image: Q-Q plot of the Rayleigh sample]

Upvotes: 12

Ingo

Reputation: 1263

I'm not sure whether this is still relevant, but I notice that neither of the other answers really addresses the question, which asks how to make Q-Q plots with SciPy and Seaborn and doesn't mention statsmodels. In fact, Q-Q plots are available in SciPy under the name probplot:

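import numpy as np
from scipy import stats
import seaborn as sns

x = np.random.normal(5, 5, 250)  # example data; replace with your own
stats.probplot(x, plot=sns.mpl.pyplot)  # sns.mpl is the matplotlib module seaborn imports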

The plot argument to probplot can be any object that has a plot method and a text method, such as the matplotlib.pyplot module or a matplotlib Axes. probplot is also quite flexible about the theoretical distributions it supports.
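To illustrate both points, here is a small sketch that passes a matplotlib Axes as `plot` and compares the same sample against two theoretical distributions through the `dist` argument. The Rayleigh sample, the seed, and the variable names are assumptions chosen for the example:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
x = rng.rayleigh(5, 250)        # illustrative data, not from the question

fig, (ax_norm, ax_ray) = plt.subplots(1, 2, figsize=(9, 4))

# An Axes object has both plot() and text(), so it can be passed as `plot`.
# probplot returns the ordered quantile pairs and a least-squares fit.
(osm, osr), (slope, intercept, r_norm) = stats.probplot(x, dist="norm", plot=ax_norm)
ax_norm.set_title(f"vs. normal (r = {r_norm:.3f})")

# `dist` accepts a scipy.stats distribution name or distribution object.
_, (_, _, r_ray) = stats.probplot(x, dist=stats.rayleigh, plot=ax_ray)
ax_ray.set_title(f"vs. Rayleigh (r = {r_ray:.3f})")

plt.show()
```

The correlation coefficient `r` of the fitted line gives a rough numeric summary of how straight each plot is; it is a visual aid rather than a formal normality test.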

Upvotes: 9

ImportanceOfBeingErnest

Reputation: 339795

After reading the Wikipedia article, I understand that a Q-Q plot is a plot of the quantiles of two distributions against each other.

numpy.percentile returns the requested percentiles of a sample. Hence you can call numpy.percentile on each of the two samples and plot the results against each other.

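import numpy as np
import matplotlib.pyplot as plt

# Two samples to compare: one normal, one Rayleigh
a = np.random.normal(5, 5, 250)
b = np.random.rayleigh(5, 250)

# Evaluate both samples at the same 21 percentiles (0, 5, ..., 100)
percs = np.linspace(0, 100, 21)
qn_a = np.percentile(a, percs)
qn_b = np.percentile(b, percs)

# Quantiles of a on the x axis, quantiles of b on the y axis
plt.plot(qn_a, qn_b, ls="", marker="o")

# Dashed y = x line for reference
x = np.linspace(np.min((qn_a.min(), qn_b.min())), np.max((qn_a.max(), qn_b.max())))
plt.plot(x, x, color="k", ls="--")

plt.show()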

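The same idea can be turned into a normality check by replacing the second sample with the theoretical quantiles of a normal distribution. This is a sketch under stated assumptions: the plotting positions (i - 0.5) / n are one common convention among several, and the normal is fitted with the sample's own mean and standard deviation:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
a = rng.normal(5, 5, 250)

# The sorted sample values serve as the empirical quantiles.
sample_q = np.sort(a)

# Plotting positions (i - 0.5) / n, an assumed convention; others exist.
n = len(a)
probs = (np.arange(1, n + 1) - 0.5) / n

# Theoretical quantiles of a normal with the sample's mean and std.
theo_q = stats.norm.ppf(probs, loc=a.mean(), scale=a.std(ddof=1))

plt.plot(theo_q, sample_q, ls="", marker="o")

# Dashed y = x line: points close to it suggest approximate normality.
lims = [min(theo_q[0], sample_q[0]), max(theo_q[-1], sample_q[-1])]
plt.plot(lims, lims, color="k", ls="--")

plt.xlabel("theoretical normal quantiles")
plt.ylabel("sample quantiles")
plt.show()
```

Using every sorted observation instead of 21 percentiles shows more detail in the tails, which is where departures from normality usually appear.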

Upvotes: 22
