Reputation: 7517

Sampling from a specific part of a normal distribution in R

I'm trying to first extract all values <= -4 (call these p1) from a "mother" normal distribution, and then randomly sample 50 of the p1 values with replacement, according to their probability of being selected in the mother distribution (call these 50 values p2). For example, -4 should be more likely to be selected than -6, which lies further out in the tail.

Does my R code below correctly capture what I described above?

mother <- rnorm(1e6)
p1 <- mother[mother <= -4]
p2 <- sample(p1, 50, replace = TRUE) # How can I define probability of being selected here?

Upvotes: 2

Answers (3)

hplieninger

Reputation: 3504

Wouldn't it be easier to sample from a truncated normal distribution in the first place?

truncnorm::rtruncnorm(50, a = -Inf, b = -4)
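
If installing truncnorm is not an option, the same kind of draw can be sketched in base R with inverse-CDF sampling. This is only an illustration, assuming a standard normal mother distribution; the names b, u and p2_inv are made up for the example.

```r
set.seed(1)                               # arbitrary seed, only for reproducibility
b <- -4                                   # upper truncation point

# Draw uniforms on (0, P(X <= b)) and map them back through the normal quantile
# function; the result follows a standard normal truncated to (-Inf, b].
u <- runif(50, min = 0, max = pnorm(b))
p2_inv <- qnorm(u)

range(p2_inv)                             # every value lies at or below b
```

Unlike subsetting a simulated mother vector, this does not depend on how many of the 1e6 draws happen to land in the tail.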

Upvotes: 1

tushaR

Reputation: 3116

I think you are looking for something like this:

mother <- rnorm(1e6)
p1 <- mother[mother <= -4]

Then calculate the probability of each p1 value being selected from mother and pass it as the prob argument:

p2 <- sample(p1, 50, replace = TRUE, prob = pnorm(p1, mean = mean(mother), sd = sd(mother)))

Upvotes: 0

Rui Barradas

Reputation: 76651

You can use the prob argument of the function sample. Quoting from help("sample"):

prob: a vector of probability weights for obtaining the elements of the vector being sampled.

And in the section Details:

The optional prob argument can be used to give a vector of weights for obtaining the elements of the vector being sampled. They need not sum to one, but they should be non-negative and not all zero.
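
A small sketch of the "need not sum to one" part, using a made-up two-element vector: weights of c(3, 1) and c(0.75, 0.25) describe the same selection probabilities, because sample rescales the weights internally.

```r
set.seed(42)                              # arbitrary seed
x <- c("a", "b")                          # hypothetical values to sample from

# Unnormalised weights 3:1 versus the equivalent probabilities 0.75 and 0.25
draws_raw  <- sample(x, 1e4, replace = TRUE, prob = c(3, 1))
draws_norm <- sample(x, 1e4, replace = TRUE, prob = c(0.75, 0.25))

# Both proportions of "a" should come out close to 0.75
prop_raw  <- mean(draws_raw == "a")
prop_norm <- mean(draws_norm == "a")
c(raw = prop_raw, normalised = prop_norm)
```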

So you must be careful: the further a value is from the mean, the smaller its probability, and the normal distribution drops to very small probabilities very quickly.
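
To get a feel for how quickly the tail thins out, the tail probabilities can be computed directly. This is a sketch for a standard normal; the variable names are illustrative only.

```r
# Probability of a standard normal value falling at or below -4, and at or below -6
tail_prob_4 <- pnorm(-4)
tail_prob_6 <- pnorm(-6)

# Expected number of the 1e6 mother draws that end up in p1 (roughly 32)
expected_n_p1 <- 1e6 * tail_prob_4

c(p_le_minus4 = tail_prob_4, p_le_minus6 = tail_prob_6, expected_n_p1 = expected_n_p1)
```

So p1 will typically hold only a few dozen values, and values near -6 will almost never appear in it at all.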

set.seed(1315)    # Make the results reproducible

mother <- rnorm(1e6)
p1 <- mother[mother <= -4]

p2 <- sample(p1, 50, replace = TRUE, prob = pnorm(p1))

You can check the resulting distribution of p2 with a histogram.

hist(p2)
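
One thing worth checking is what the weights do to the result. Because p1 was itself drawn from the mother distribution, its values already occur in proportion to the normal density, so resampling it without prob keeps the tail shape, while pnorm weights tilt the sample further toward -4. The sketch below is an illustration rather than part of the code above: it uses a large simulated tail sample (tail_vals, generated by inverse-CDF sampling as a stand-in for p1, which holds only about 30 values) and compares both resamples with the theoretical mean of a standard normal truncated at -4.

```r
set.seed(1315)                            # same seed as above
b <- -4

# Theoretical mean of a standard normal truncated to (-Inf, b]
theo_mean <- -dnorm(b) / pnorm(b)

# Large stand-in for p1: standard normal values conditioned on being <= b
tail_vals <- qnorm(runif(1e5, min = 0, max = pnorm(b)))

# Resample without weights and with pnorm weights
unweighted <- sample(tail_vals, 1e4, replace = TRUE)
weighted   <- sample(tail_vals, 1e4, replace = TRUE, prob = pnorm(tail_vals))

c(theory = theo_mean, unweighted = mean(unweighted), weighted = mean(weighted))
```

If the goal is for p2 to follow the mother distribution restricted to values <= -4, the unweighted mean is the one that should sit close to the theoretical value; the weighted mean ends up closer to -4.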

Upvotes: 1
