Reputation: 7517
I'm trying to first extract all values <= -4 (call these p1) from a mother normal distribution. Then, I want to randomly sample 50 of the p1 values with replacement according to their probability of being selected in the mother (call these 50 values p2). For example, -4 is more likely to be selected than -6, which is further into the tail area.
I was wondering if my R code below correctly captures what I described above?
mother <- rnorm(1e6)
p1 <- mother[mother <= -4]
p2 <- sample(p1, 50, replace = T) # How can I define probability of being selected here?
Upvotes: 2
Views: 955
Reputation: 3504
Wouldn't it be easier to sample from a truncated normal distribution in the first place?
truncnorm::rtruncnorm(50, a = -Inf, b = -4)
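If installing truncnorm is not an option, a similar draw can be sketched in base R with inverse-CDF sampling: draw uniforms between 0 and P(X <= -4), then map them back through the normal quantile function. This is a minimal sketch that assumes a standard normal mother; the names upper and u and the seed are illustrative and not part of the answer above.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

upper <- -4  # truncation point: keep only values <= -4

# Uniform draws on (0, P(X <= upper)) for a standard normal X
u <- runif(50, min = 0, max = pnorm(upper))

# Map the uniforms back through the normal quantile function
p2 <- qnorm(u)

print(range(p2))
```

Every value in p2 falls at or below -4, and no large mother sample has to be generated first.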
Upvotes: 1
Reputation: 3116
I think you are looking for something like this:
mother <- rnorm(1e6)
p1 <- mother[mother <= -4]
# Calculate the probability of each p1 value being selected from mother
p2 <- sample(p1, 50, replace = T, prob = pnorm(p1, mean = mean(mother), sd = sd(mother)))
Upvotes: 0
Reputation: 76651
You can use the prob argument of the function sample. Quoting from help("sample"):
prob: a vector of probability weights for obtaining the elements of the vector being sampled.
And in the section Details:
The optional prob argument can be used to give a vector of weights for obtaining the elements of the vector being sampled. They need not sum to one, but they should be non-negative and not all zero.
So you must be careful: the further a value is from the mean, the smaller its probability, and the normal distribution drops to very small probabilities very quickly.
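To get a feel for how fast the tail thins out, the sketch below prints the standard normal tail probabilities at a few cut-offs, along with the number of draws expected at or below -4 in a sample of one million (about 32). The names cutoffs, tail_prob and expected_n are illustrative.

```r
# Tail probabilities P(X <= x) of a standard normal at a few cut-offs
cutoffs <- c(-4, -5, -6)
tail_prob <- pnorm(cutoffs)
print(signif(tail_prob, 3))

# Expected number of draws at or below -4 in a sample of one million
expected_n <- 1e6 * pnorm(-4)
print(round(expected_n))
```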
set.seed(1315) # Make the results reproducible
mother <- rnorm(1e6)
p1 <- mother[mother <= -4]
p2 <- sample(p1, 50, replace = T, prob = pnorm(p1))
You can see that it worked with the histogram.
hist(p2)
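A histogram of 50 draws is hard to judge by eye, so a rough numerical check may help. The sketch below compares the mean of an unweighted resample and of a pnorm-weighted resample against the theoretical mean of a standard normal truncated at -4, which is -dnorm(-4) / pnorm(-4). Because p1 is itself a draw from mother, its values already appear in proportion to the normal density, so it may be worth checking whether the extra weights pull the sample further toward -4. The resample size n_check and the other new variable names are illustrative choices.

```r
set.seed(1315)  # same seed as above; any seed works
mother <- rnorm(1e6)
p1 <- mother[mother <= -4]

# Theoretical mean of a standard normal truncated to (-Inf, -4]
theoretical_mean <- -dnorm(-4) / pnorm(-4)

# Large resamples (n_check is an illustrative choice) so the means are stable
n_check <- 1e4
unweighted <- sample(p1, n_check, replace = TRUE)
weighted <- sample(p1, n_check, replace = TRUE, prob = pnorm(p1))

print(c(theory = theoretical_mean,
        mean_p1 = mean(p1),
        unweighted = mean(unweighted),
        weighted = mean(weighted)))
```

With only a few dozen values in p1, all of these means are noisy, so treat the comparison as indicative rather than conclusive.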
Upvotes: 1