mafalda
mafalda

Reputation: 31

Create m samples with size n Normal Distribution

I have to create m=1450 samples of size n where n{100,200,300...5000} of an X population with normal distribution of mean 9,22 and variance=1.62^2. For each sample i need to create confidence intervals for the mean with the assumption that variance is unknown (significance level=0.98). Then for each n i have to calculate the mean of amplitude of the confidence intervals.

For the first part (create m samples of size n) i tried the following, without success:

sizes <- seq(100,5000,by=100)
m<-1450
normal=for (i in sizes){
replicate(1450, rnorm(i,9.22,sqrt(1.62^2), simplify=FALSE)
}

For the confidence intervals can I use something like the following?:

get.conf.int = function(x) t.test(x, conf.level = 0.98)$conf.int
conf.int = apply(normal, 2, get.conf.int)

Upvotes: 2

Views: 337

Answers (2)

jay.sf
jay.sf

Reputation: 72984

This strategy should do the trick (example simplified for better display).

sizes <- seq(5, 7, by=1)
set.seed(42)
samp <- lapply(sizes, \(n) replicate(10, rnorm(n, 9.22, sqrt(1.62^2)), 
                                     simplify=FALSE)) |>
  setNames(paste0('N', sizes))

Result

str(samp)
# List of 3
# $ N5:List of 10
# ..$ : num [1:5] 11.44 8.31 9.81 10.25 9.87
# ..$ : num [1:5] 9.05 11.67 9.07 12.49 9.12
# ..$ : num [1:5] 11.33 12.92 6.97 8.77 9
# ..$ : num [1:5] 10.25 8.76 4.92 5.27 11.36
# ..$ : num [1:5] 8.72 6.33 8.94 11.19 12.29
# ..$ : num [1:5] 8.52 8.8 6.36 9.97 8.18
# ..$ : num [1:5] 9.96 10.36 10.9 8.23 10.04
# ..$ : num [1:5] 6.44 7.95 7.84 5.31 9.28
# ..$ : num [1:5] 9.55 8.64 10.45 8.04 7
# ..$ : num [1:5] 9.92 7.91 11.56 8.52 10.28
# $ N6:List of 10
# ..$ : num [1:6] 9.74 7.95 11.77 10.26 9.37 ...
# ..$ : num [1:6] 10.32 9.37 4.37 9.68 8.63 ...
# ..$ : num [1:6] 10.16 11.49 8.04 11.33 9.76 ...
# ..$ : num [1:6] 10.71 10.39 7.53 9.07 10.23 ...
# ..$ : num [1:6] 8.34 10.16 10.46 9.97 7.79 ...
# ..$ : num [1:6] 11.67 9.64 9.36 9.02 7.29 ...
# ..$ : num [1:6] 8.87 8.92 10.73 10.55 11.48 ...
# ..$ : num [1:6] 10.27 11.47 7.42 7.83 7.39 ...
# ..$ : num [1:6] 9.35 10.28 11.17 10.91 7.59 ...
# ..$ : num [1:6] 8.14 9.39 8.54 9.02 9.52 ...
# $ N7:List of 10
# ..$ : num [1:7] 9.18 9.4 8.43 8.4 6.53 ...
# ..$ : num [1:7] 13.6 7.01 9.44 6.8 6.84 ...
# ..$ : num [1:7] 9.22 8.53 8.23 5.94 7.24 ...
# ..$ : num [1:7] 8.42 9.22 11.04 11.55 7.44 ...
# ..$ : num [1:7] 8.46 9.13 9.08 7.78 8.5 ...
# ..$ : num [1:7] 11.02 8.44 8.52 10.35 7.51 ...
# ..$ : num [1:7] 11.11 8.78 8.46 7.21 9.21 ...
# ..$ : num [1:7] 11.31 8.94 7.48 9.48 8.63 ...
# ..$ : num [1:7] 7.61 9.96 9.36 10.67 8.85 ...
# ..$ : num [1:7] 11.96 10.62 8.98 6.87 10.26 ...

Confidence intervals

ci <- rapply(samp, \(x) c(mean(x), t.test(x, conf.level=0.98)$conf.int), how='l')

ci
# $N5
# $N5[[1]]
# [1]  8.661997  5.349134 11.974859
# 
# $N5[[2]]
# [1]  9.66209  7.76908 11.55510
# 
# $N5[[3]]
# [1]  9.164220  6.994262 11.334178
# 
# $N5[[4]]
# [1]  9.198121  6.630857 11.765386
# 
# $N5[[5]]
# [1]  8.579704  7.149364 10.010045
# 
# $N5[[6]]
# [1] 10.399805  7.511904 13.287707
# 
# $N5[[7]]
# [1]  9.920176  8.208620 11.631733
# 
# $N5[[8]]
# [1]  9.718574  9.220217 10.216931
# 
# $N5[[9]]
# [1] 10.446734  8.075362 12.818106
# 
# $N5[[10]]
# [1]  9.343241  6.176216 12.510266
# 
# 
# $N6
# $N6[[1]]
# [1]  9.029016  7.801650 10.256383
# 
# $N6[[2]]
# [1]  8.958646  6.354959 11.562332
# 
# $N6[[3]]
# [1]  8.805205  6.649026 10.961384
# 
# $N6[[4]]
# [1] 10.198475  7.999154 12.397796
# 
# $N6[[5]]
# [1]  8.947646  7.815356 10.079936
# 
# $N6[[6]]
# [1]  9.858800  8.045126 11.672473
# 
# $N6[[7]]
# [1]  9.646197  8.544677 10.747718
# 
# $N6[[8]]
# [1]  9.378679  7.294165 11.463194
# 
# $N6[[9]]
# [1]  9.005310  6.037539 11.973082
# 
# $N6[[10]]
# [1]  9.444443  6.934462 11.954424
# 
# 
# $N7
# $N7[[1]]
# [1] 10.001309  7.422273 12.580345
# 
# $N7[[2]]
# [1]  9.165244  6.480325 11.850162
# 
# $N7[[3]]
# [1]  9.506281  8.116342 10.896221
# 
# $N7[[4]]
# [1]  9.280690  8.137453 10.423928
# 
# $N7[[5]]
# [1]  9.830176  8.354489 11.305863
# 
# $N7[[6]]
# [1]  9.377695  7.933564 10.821826
# 
# $N7[[7]]
# [1] 8.028952 6.493226 9.564679
# 
# $N7[[8]]
# [1]  9.552392  8.976136 10.128648
# 
# $N7[[9]]
# [1]  9.452145  7.787782 11.116508
# 
# $N7[[10]]
# [1]  8.990971  7.091184 10.890757

Note:

>  R.version.string
[1] "R version 4.1.2 (2021-11-01)"

Upvotes: 1

Oliver
Oliver

Reputation: 8572

assigning a for loop doesn't make sense in R terms, instead you can use lapply which'll return the entire output as a list or sapply which will try to convert to a array if possible. Here returning a list makes more sense. Note below I use lambda function \(x) = function(x) which were introduced in R 4.1.0

m <- 1450
sizes <- 1:50
size_factor <- 100
alpha <- 0.02
# Simulate outcomes
sims <- lapply(sizes * size_factor, \(x){
  replicate(m, rnorm(x, 9.22, 1.62))
})
# Find 1 - alpha confidence intervals
lapply(sims, \(x){
  # Find simulated means
  mus <- colMeans(x)
  # Find S.E of the simulated means (assuming same variance)
  s <- sd(mus)
  # Make confidence interval
  mean(mus) + c(-1, 1) * s * qnorm(1 - alpha/2)
})

Upvotes: 0

Related Questions