Aleksandr Samarin

Reputation: 275

Seaborn distplot: fit distribution with some fixed parameters

I want to fit some distribution, say gamma, to a given data array x, and plot the corresponding density function. I can do this easily with seaborn.distplot and scipy.stats:

sns.distplot(x, fit = stats.gamma)

However, let's say that I want some parameters of this distribution to remain fixed, for example loc. When I use the fit function from scipy.stats with loc fixed, I write it as

stats.gamma.fit(x, floc = 0)

Is there a way to fix loc at 0 (i.e. pass floc=0 to fit) from within the distplot function and achieve the same result?

Upvotes: 2

Views: 3605

Answers (3)

hojjat KAVEH

Reputation: 11

This is a good question. To find the gamma distribution that approximates a given data set, you can use the likelihood function. The formulation is given at https://en.wikipedia.org/wiki/Gamma_distribution. Here I generate some random numbers from a gamma distribution with given parameters, then use that data and maximum likelihood to estimate the parameters of the distribution.

import numpy as np
from scipy.special import psi
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.optimize import fsolve

# Generate random samples from a gamma distribution:

shape, scale = 3., 5.
s = np.random.gamma(shape, scale, 1000)

plt.hist(s, 50, density=True)


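# Solve the likelihood equation for the shape k (loc is implicitly 0):
Log_s = np.log(s)
Mean_s = np.mean(s)
Size_s = np.size(s)

# fsolve returns an array, so take its single element
k = fsolve(lambda k: np.log(k) - psi(k) - np.log(Mean_s) + (1/Size_s)*np.sum(Log_s), .1)[0]
theta = Mean_s / k

# Overlay a KDE of fresh samples drawn from the fitted gamma distribution
sns.distplot(np.random.gamma(k, theta, 1000), hist=False, label='Gamma')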

plt.show()
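
As a cross-check, the equation solved above is the maximum-likelihood condition for a gamma distribution with loc held at 0, so it should agree with stats.gamma.fit(s, floc=0). The following is a minimal sketch under that assumption; it swaps fsolve for a bracketed root finder (brentq), and the seed and the bracket [1e-3, 1e3] are illustrative choices, not part of the original answer.

```python
import numpy as np
from scipy import stats
from scipy.optimize import brentq
from scipy.special import psi

# Illustrative data: gamma samples with shape 3 and scale 5 (seed is arbitrary)
rng = np.random.default_rng(0)
s = rng.gamma(3.0, 5.0, 1000)

# Right-hand side of the likelihood equation: log(mean(s)) - mean(log(s))
c = np.log(s.mean()) - np.log(s).mean()

# Solve log(k) - psi(k) = c for the shape k on an assumed bracket
k_mle = brentq(lambda k: np.log(k) - psi(k) - c, 1e-3, 1e3)
theta_mle = s.mean() / k_mle

# scipy's own fit with loc fixed at 0; returns (shape, loc, scale)
k_fit, loc_fit, theta_fit = stats.gamma.fit(s, floc=0)

print("manual MLE:", k_mle, theta_mle)
print("scipy fit :", k_fit, theta_fit)
```

If the two lines printed differ noticeably, the bracket or the data would be the first things to check.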

Upvotes: 1

Iain D

Reputation: 507

The simplest way to do this is not to plot the fit with distplot, but to use the approach described in this post. A simple example:

import matplotlib.pyplot as plt
import pandas as pd
from scipy import stats
import numpy as np

df = pd.DataFrame(np.random.gamma(2, scale=2, size=5000),
                  columns=['samples'])

# floc=0 holds loc fixed at 0; loc=0 would only set the optimizer's starting guess
params = stats.gamma.fit(df.samples, floc=0)
xvals = np.linspace(0, df.samples.max())
pdf = lambda x: stats.gamma.pdf(x, *params)
yvals = pdf(xvals)

fig, ax1 = plt.subplots()

# density=True replaces the removed normed=True argument
df.samples.hist(bins=20, ax=ax1, density=True, label='Samples',
                grid=False, edgecolor='k')

ax1.plot(xvals, yvals, c='r', label='Fit')
ax1.legend()

This produces a histogram of the samples with the fitted density drawn over it (not enough rep to embed the image).
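
One detail worth spelling out: in scipy.stats fit methods, loc=0 only supplies a starting guess for the optimizer, whereas floc=0 holds the location fixed at 0. A small sketch of the difference follows; the seed and sample size are arbitrary choices for illustration.

```python
import numpy as np
from scipy import stats

# Illustrative gamma samples (shape 2, scale 2); the seed is arbitrary
rng = np.random.default_rng(0)
samples = rng.gamma(2.0, scale=2.0, size=5000)

# loc=0 is only an initial guess, so the fitted loc is free to move
free_params = stats.gamma.fit(samples, loc=0)

# floc=0 pins loc at 0; only shape and scale are estimated
fixed_params = stats.gamma.fit(samples, floc=0)

print("free  (shape, loc, scale):", free_params)
print("fixed (shape, loc, scale):", fixed_params)
```

The second element of fixed_params is the fixed location; in free_params it is whatever the optimizer settled on.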

Upvotes: 0

ImportanceOfBeingErnest

Reputation: 339052

Under the condition that sns.distplot(x, fit = stats.gamma) will indeed show a reasonable plot and that stats.gamma.fit(x, loc = 0) would give the desired statistics, you can supply the argument via the fit_kws:

sns.distplot(x, fit = stats.gamma, fit_kws={"loc" : 0})

[This answer is based on reading the documentation and is untested, because no use case was given in the question.]
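
If fit_kws turns out to be forwarded to the plotting call and not to fit (the distplot documentation describes the *_kws dictionaries as keyword arguments for the underlying plotting functions), a possible fallback is to pass a small adapter object as fit. The sketch below assumes distplot only calls fit(data) and pdf(x, *params) on that object, as its documentation of the fit parameter suggests; the class name GammaFixedLoc is hypothetical, and the seaborn call is left as a comment because distplot is deprecated in newer seaborn releases.

```python
import numpy as np
from scipy import stats


class GammaFixedLoc:
    """Hypothetical adapter: a gamma distribution whose fit always fixes loc at 0."""

    @staticmethod
    def fit(data):
        # Returns (shape, loc, scale) with loc held at 0
        return stats.gamma.fit(data, floc=0)

    @staticmethod
    def pdf(x, *params):
        # Evaluate the gamma density with the fitted parameters
        return stats.gamma.pdf(x, *params)


# Illustrative data; the seed is arbitrary
rng = np.random.default_rng(0)
x = rng.gamma(2.0, scale=2.0, size=2000)

params = GammaFixedLoc.fit(x)
density = GammaFixedLoc.pdf(np.linspace(0.1, 20.0, 50), *params)

# Assumed usage with a seaborn version that still ships distplot:
# import seaborn as sns
# sns.distplot(x, fit=GammaFixedLoc, kde=False)
```

The adapter keeps the fixed-parameter choice in one place, so the plotting call itself does not need to know about floc.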

Upvotes: 0
