Reputation: 328
I'm trying to superimpose a normal distribution on some data. I have binned and plotted the data, and I want to generate a normal distribution for comparison. I'm using jStat for this.
So far I have been able to generate the normal distribution, but I can't figure out how to 'scale' it up to be the same size as the actual data.
The normal distribution data is orders of magnitude smaller than the actual data and shows up nearly flat on the graph.
Here is what I mean:
Here's a plot of the black line with the blue turned off. I would assume that these are probabilities and not frequencies.
Here is the code I am using so far to generate the normal distribution:
// Mean & Std dev for calculating normal pdf
var mean = jStat.mean(data);
var stdev = jStat.stdev(data);
// get min & max for defining range of values for normal
var min = jStat.min(data);
var max = jStat.max(data);
// binNum = number of bins
var normData = jStat.seq(min, max, binNum, (x) => {
  return jStat.normal.pdf(x, mean, stdev);
});
I've tried multiplying by the sample size (among other things), to no avail. Is there a way to convert the probabilities into frequencies or otherwise 'scale' the normal distribution?
Upvotes: 0
Views: 713
Reputation: 328
I finally solved this. Leaving it here for anyone coming down this path.
The solution was to multiply the resulting normal distribution values by the scale factor binSize * sampleSize.
In simple terms, the area under the normal distribution curve is 1 (by definition). The factor binSize * sampleSize is the total area of the histogram: each bar is binSize wide, and the bar heights (the counts) add up to sampleSize. So you scale the normal distribution so that the two areas are equal.
I'm not sure that is the best explanation, but here is some guidance on the solution. It's done in Excel, but it pointed me in the right direction.
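For anyone who wants to see the scaling in code, here is a minimal sketch of the idea. It is not the exact code from my project. The names normalPdf and expectedCounts are made up for illustration, and the pdf is written out by hand so the snippet runs without jStat (with jStat you would call jStat.normal.pdf(x, mean, stdev) instead). It assumes equal-width bins spanning min to max, evaluates the curve at the bin centers, and uses the population standard deviation (dividing by n), which is a choice on my part.

```javascript
// Normal probability density, written out by hand so the sketch has no
// dependencies. jStat.normal.pdf(x, mean, stdev) could be used instead.
function normalPdf(x, mean, stdev) {
  const z = (x - mean) / stdev;
  return Math.exp(-0.5 * z * z) / (stdev * Math.sqrt(2 * Math.PI));
}

// Hypothetical helper: normal curve scaled to the histogram's count axis.
// Assumes binNum equal-width bins covering [min, max] of the data.
function expectedCounts(data, binNum) {
  const sampleSize = data.length;
  const mean = data.reduce((a, b) => a + b, 0) / sampleSize;
  // Population variance (divide by n); an assumption of this sketch.
  const variance =
    data.reduce((a, b) => a + (b - mean) ** 2, 0) / sampleSize;
  const stdev = Math.sqrt(variance);

  const min = Math.min(...data);
  const max = Math.max(...data);
  const binSize = (max - min) / binNum;

  // Area of the histogram: bar width times the sum of the counts.
  const scale = binSize * sampleSize;

  const curve = [];
  for (let i = 0; i < binNum; i++) {
    const center = min + (i + 0.5) * binSize; // midpoint of bin i
    curve.push({ x: center, y: scale * normalPdf(center, mean, stdev) });
  }
  return curve;
}
```

Plot the y values of the returned points on the same axis as the bin counts. Each y is the density at the bin center times binSize * sampleSize, so it approximates the count a normal distribution would put in that bin.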
Upvotes: 2