Reputation: 7231
I have an input dataset (a DataFrame / NumPy matrix) that follows a skewed normal distribution. I am looking for a Python transformation function (or NumPy matrix) that will transform the input dataset into a normal distribution with no skew.
I have looked at curve_fit (in scipy.optimize) and am not sure how I would go about applying it.
Is there a simple method of doing this?
Upvotes: 2
Views: 1357
Reputation: 294228
I've done one of 2 things:
Example
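import numpy as np
import pandas as pd
from scipy.stats import norm

df = pd.DataFrame(np.random.rand(1000), columns=['Uniform'])
# Turn ranks into quantiles strictly inside (0, 1), then apply the inverse normal CDF
df['Normal'] = norm.ppf((df.Uniform.rank() - .5) / len(df))
df.plot(kind='kde')
df.skew()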
Uniform 2.392991e-02
Normal 2.114051e-15
dtype: float64
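The example starts from uniform data, so here is a rough sketch of the same rank-based idea applied to data that is actually skewed. The helper name `rank_to_normal`, the lognormal test data, and the column name `Skewed` are my own assumptions for illustration, not part of the original example.

```python
import numpy as np
import pandas as pd
from scipy.stats import norm


def rank_to_normal(series: pd.Series) -> pd.Series:
    """Map a numeric Series to approximate standard-normal scores via its ranks.

    Hypothetical helper: it wraps the rank -> quantile -> norm.ppf steps.
    """
    # Ranks run from 1 to n; subtracting 0.5 and dividing by n keeps every
    # quantile strictly inside (0, 1), so norm.ppf never returns +/- infinity.
    quantiles = (series.rank() - 0.5) / series.count()
    scores = norm.ppf(quantiles.to_numpy())
    return pd.Series(scores, index=series.index, name=series.name)


# Assumed test data: a right-skewed lognormal sample
rng = np.random.default_rng(0)
skewed = pd.DataFrame({"Skewed": rng.lognormal(size=1000)})

# Apply the transform column by column
transformed = skewed.apply(rank_to_normal)

skew_before = skewed["Skewed"].skew()
skew_after = transformed["Skewed"].skew()
print(skew_before, skew_after)
```

Note that this is a nonlinear, rank-based mapping, so it cannot be expressed as a single matrix multiplication. It also depends only on the order of the values: tied values get the same score, and transforming new data later would need something extra, such as interpolating against the original sample.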
Upvotes: 1