I have a process A that captures 86400 sample points per day from a system B. I repeat process A for 23 days, so at the end I have 23 mean values and 23 standard deviation (sd) values, one pair per day. I am trying to come up with a normal distribution for this entire process, and for that I need a representative mean and standard deviation. For the representative mean, I can take the average of all 23 means, but I am not sure what the representative of the 23 standard deviations should be.
Is it right to take the average of all the standard deviation values as the representative standard deviation for the entire process?
All 86400 sample points are random numbers between 0 and 20.
It is unclear what you mean by "trying to come up with a normal distribution for this entire process", but I hope this helps:
You have a list of means; to get the representative mean, you are doing the right thing by taking their average. To get the representative standard deviation, take the 23 means that you calculated and find their standard deviation, treating them as a data set. Below is some R code that I hope you can translate to fit your needs.
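# processA_runFor23Days() and getMeanForEachDay() are placeholders for your own code
data <- processA_runFor23Days()
daily_means <- getMeanForEachDay(data)  # this should be a vector of length 23
sd(daily_means)  # standard deviation of the 23 daily means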
Here "daily_means" holds the mean for each day. I think this should be OK since each day has the same number of data points.
EDIT: To be clearer, let's say that you have the means for each of the 23 days:
> daily_means
[1] 0.59073346 0.66107694 0.32187724 0.60259824 0.92803502 0.82414235
[7] 0.21125403 0.61161841 0.48346220 0.86058580 0.87253787 0.94609922
[13] 0.40849556 0.96766218 0.49403126 0.38261995 0.02554012 0.19892710
[19] 0.55517159 0.71836176 0.53599262 0.67525105 0.25059165
Ignore the standard deviations for each of the days; they no longer matter. Your new distribution is now the means from each day, so take the mean and the standard deviation of these 23 numbers.
> mean(daily_means)
[1] 0.5707246
> sd(daily_means)
[1] 0.2624342
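If it helps to try this end to end, here is a minimal sketch that simulates the setup from the question and applies the same two steps. It assumes that "random numbers between 0 and 20" means draws from a uniform distribution on [0, 20]; the seed and the names `samples`, `rep_mean`, and `rep_sd` are only illustrative and not part of the original code.

```r
# Minimal sketch: simulate the setup described in the question.
# Assumption: "random numbers between 0 and 20" is modelled as Uniform(0, 20).
set.seed(1)  # arbitrary seed, only for reproducibility

n_days <- 23
n_per_day <- 86400

# One column per day; each column holds that day's samples.
samples <- matrix(runif(n_days * n_per_day, min = 0, max = 20),
                  nrow = n_per_day, ncol = n_days)

daily_means <- colMeans(samples)  # vector of length 23
rep_mean <- mean(daily_means)     # average of the daily means
rep_sd <- sd(daily_means)         # sd of the daily means, as suggested above

print(rep_mean)
print(rep_sd)
```

Because every day has the same number of points, `mean(daily_means)` matches the mean of all samples pooled together. Keep in mind that `sd(daily_means)` describes how much the daily mean moves from day to day, so for data like this it comes out much smaller than the spread of the individual samples within a day.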