Reputation: 575
I have files with irradiance data measured every minute, 24 hours a day. On a day without any clouds in the sky, the data shows a nice continuous bell curve. To find cloudless days, I have always plotted the data month by month with gnuplot and checked for nice bell curves.
I was wondering if there's a Python way to check whether the irradiance measurements form a continuous bell curve. I don't know if the question is too vague, but I'm simply looking for some ideas on that quest :-)
Upvotes: 1
Views: 3243
Reputation: 2038
Just to complement the given answer with a code example: one can use a Kolmogorov-Smirnov test to obtain a measure for the "distance" between two distributions. SciPy offers a neat interface for this, called kstest:
from scipy import stats
import numpy as np
data = np.random.normal(size=100) # Our (synthetic) dataset
D, p = stats.kstest(data, "norm") # Perform a one-sample Kolmogorov-Smirnov test
In the above example, D denotes the distance between our data and a standard normal (norm) distribution (smaller is better), and p denotes the corresponding p-value. Other distributions can be tested in the same way by substituting norm with any of those implemented in scipy.stats.
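One detail worth illustrating: with no extra arguments, kstest compares against a normal distribution with mean 0 and standard deviation 1, so data on another scale will look "far" from normal even when it is perfectly Gaussian. The sketch below passes a location and scale through the args parameter. The synthetic data, the seed and the variable names are assumptions made for illustration. Note also that estimating the parameters from the same sample makes the reported p-value optimistic.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic, normally distributed data that is NOT standard normal
data = rng.normal(loc=5.0, scale=2.0, size=200)

# Compared against the standard normal (mean 0, std 1): large distance
d_standard, p_standard = stats.kstest(data, "norm")

# Compared against a normal with the sample's own mean and std: small distance
loc, scale = data.mean(), data.std(ddof=1)
d_fitted, p_fitted = stats.kstest(data, "norm", args=(loc, scale))

print(f"standard normal: D={d_standard:.3f}, p={p_standard:.3g}")
print(f"fitted normal:   D={d_fitted:.3f}, p={p_fitted:.3g}")
```

Standardizing the data first, i.e. (data - loc) / scale, and testing against "norm" should give the same distance.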
Upvotes: 1
Reputation: 1218
For a normal distribution, there are normality tests.
In short, we abuse some knowledge we have of what normal distributions look like to identify them.
The kurtosis of any normal distribution is 3. Compute the kurtosis of your data and it should be close to 3.
The skewness of a normal distribution is zero, so your data should have a skewness close to zero.
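As a minimal sketch of these two checks, scipy.stats provides skew and kurtosis. One thing to watch for: scipy.stats.kurtosis returns the excess kurtosis by default (0 for a normal distribution), so fisher=False is needed to get the value of 3 mentioned above. The samples and the seed are synthetic and only serve as an illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
normal_sample = rng.normal(size=10_000)    # synthetic normal data
uniform_sample = rng.uniform(size=10_000)  # synthetic non-normal data

for name, sample in [("normal", normal_sample), ("uniform", uniform_sample)]:
    skewness = stats.skew(sample)
    # fisher=False gives Pearson's kurtosis, which is 3 for a normal distribution
    kurt = stats.kurtosis(sample, fisher=False)
    print(f"{name:8s} skewness={skewness:+.3f} kurtosis={kurt:.3f}")

normal_skew = stats.skew(normal_sample)
normal_kurt = stats.kurtosis(normal_sample, fisher=False)
uniform_kurt = stats.kurtosis(uniform_sample, fisher=False)
```

The uniform sample is also symmetric, so its skewness is close to zero as well; it is the kurtosis (about 1.8 instead of 3) that gives it away, which is why both numbers should be checked together.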
More generally, you could compute a reference distribution and use a divergence measure to assess the difference between the distributions. Bin your data, create a histogram, and start with the Jensen-Shannon divergence.
With the divergence approach, you can compare to an arbitrary distribution. You might record a thousand sunny days, build a reference from them, and check whether the divergence between that sunny-day reference and your measured day is below some threshold.
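A rough sketch of that idea, assuming one value per minute (1440 per day): each daily profile is normalized so that it sums to 1 and is then compared to a reference profile. Everything below is made up for illustration, including the Gaussian-shaped reference, the simulated cloud dip, the helper name js_divergence and the threshold of 0.005; a real threshold would have to be tuned on days that were checked by eye. scipy.spatial.distance.jensenshannon returns the Jensen-Shannon distance, which is the square root of the divergence, hence the squaring.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def js_divergence(profile, reference):
    """Jensen-Shannon divergence (in bits) between two daily profiles."""
    p = np.clip(np.asarray(profile, dtype=float), 0, None)
    q = np.clip(np.asarray(reference, dtype=float), 0, None)
    # jensenshannon returns the distance, i.e. sqrt(divergence)
    return jensenshannon(p / p.sum(), q / q.sum(), base=2) ** 2

minutes = np.arange(1440)
# Hypothetical clear-sky reference; in practice, average many known sunny days
reference = 1000 * np.exp(-0.5 * ((minutes - 720) / 150) ** 2)

rng = np.random.default_rng(0)
# A clear day: the reference plus 1 % measurement noise
clear_day = reference * (1 + 0.01 * rng.standard_normal(minutes.size))
# A cloudy day: irradiance drops to 30 % between minute 600 and 700
cloudy_day = reference.copy()
cloudy_day[600:700] *= 0.3

THRESHOLD = 0.005  # assumed value, needs tuning on real data
clear_div = js_divergence(clear_day, reference)
cloudy_div = js_divergence(cloudy_day, reference)
for name, div in [("clear", clear_div), ("cloudy", cloudy_div)]:
    print(f"{name:6s} divergence={div:.5f} cloudless={div < THRESHOLD}")
```

Because the profiles are normalized, this compares only the shape of the day, not its absolute irradiance level; a uniformly hazy day with a perfect bell shape would still pass.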
Upvotes: 2