Jolanda Kossakowski
Jolanda Kossakowski

Reputation: 461

setting upper and lower limits in rnorm

I am simulating data using rnorm, but I need to set an upper and lower limit, does anyone know how to do this?

code:

rnorm(n = 10, mean = 39.74, sd = 25.09)

Upper limit needs to be 340, and the lower limit 0

I am asking this question because I am rewriting an SAS-code into an R-code. I have never used SAS. I am trying to rewrite the following piece of code:

sim_sample(simtot=100000,seed=10004,lbound=0,ubound=340,round_y=0.01,round_m=0.01,round_sd=0.01,n=15,m=39.74,sd=25.11,mk=4)

Upvotes: 24

Views: 40311

Answers (6)

GKi
GKi

Reputation: 39657

There are several ways to set upper and lower limits to a normal distribution, what will cause that the result is no longer normal distributed.

Assuming a mean=0, sd=1 producing N=1e5 values with a lower boundary of LO=-1 and an upper boundary of UP=2.

N <- 1e5L
LO <- -1
UP <- 2

Move outliers to border (@Roland)

set.seed(42)
x <- pmax(LO, pmin(UP, rnorm(N)))
mean(x)
#[1] 0.07238029
median(x)
#[1] -0.002066374
sd(x)
#[1] 0.8457605
hist(x, 30)

hist of Set outliers to borders

Cut outliers of (@Dason, @Roland, truncnorm::rtruncnorm, MCMCglmm::rtnorm)

set.seed(42)
x <- qnorm(runif(N, pnorm(LO), pnorm(UP)))
mean(x)
#[1] 0.2317875
median(x)
#[1] 0.173679
sd(x)
#[1] 0.7236536

hist of Cut outliers of

Scale (@Alex Essilfie)

set.seed(42)
x <- rnorm(N)
x <- (x-min(x))/(max(x)-min(x))*(UP-LO)+LO
mean(x)
#[1] 0.4474876
median(x)
#[1] 0.4482257
sd(x)
#[1] 0.3595199

hist of scale

Combination of methods. E.g. Cut and scale:

set.seed(42)
x <- qnorm(runif(N, pnorm(-3), pnorm(3)))
x <- (x-min(x))/(max(x)-min(x))*(UP-LO)+LO
mean(x)
#[1] 0.5010759
median(x)
#[1] 0.5014713
sd(x)
#[1] 0.4957751

hist of Combination

Asymmetric combination

set.seed(42)
n <- round(N*abs(LO)/diff(range(c(LO, UP))))
x <- c(qnorm(runif(n, pnorm(-3), 0.5)), qnorm(runif(N-n, 0.5, pnorm(3))))
x <- ifelse(x < 0, x/min(x)*LO, x/max(x)*UP)
mean(x)
#[1] 0.2651627
median(x)
#[1] 0.2127903
sd(x)
#[1] 0.5078264

hist of asymmetric combination

Upvotes: 0

Alex Essilfie
Alex Essilfie

Reputation: 12613

This is the function that I wrote to achieve the same purpose. It normalizes the result from the rnorm function and then adjusts it to fit the range.

NOTE: The standard deviation and mean (if specified) get altered during the normalization process.

#' Creates a random normal distribution within the specified bounds.
#' 
#' WARNING: This function does not preserve the standard deviation or mean.
#' @param n The number of values to be generated
#' @param mean The mean of the distribution
#' @param sd The standard deviation of the distribution
#' @param lower The lower limit of the distribution
#' @param upper The upper limit of the distribution
rtnorm <- function(n, mean=NA, sd=1, lower=-1, upper=1){
  mean = ifelse(is.na(mean)|| mean < lower || mean > upper,
                mean(c(lower, upper)), mean)
  data <- rnorm(n, mean=m, sd=sd) # data

  if (!is.na(lower) && !is.na(upper)){ # adjust data to specified range
    drange <- range(data)           # data range
    irange <- range(lower, upper)   # input range
    data <- (data - drange[1])/(drange[2] - drange[1]) # normalize data (make it 0 to 1)
    data <- (data * (irange[2] - irange[1]))+irange[1] # adjust to specified range
  }
  return(data)
}

Upvotes: 0

Dason
Dason

Reputation: 61933

You can make your own truncated normal sampler that doesn't require you to throw out observations quite simply

rtnorm <- function(n, mean, sd, a = -Inf, b = Inf){
    qnorm(runif(n, pnorm(a, mean, sd), pnorm(b, mean, sd)), mean, sd)
}

Upvotes: 15

charlie
charlie

Reputation: 622

The rtruncnorm() function will return the results you need.

  library(truncnorm)
  rtruncnorm(n=10, a=0, b=340, mean=39.4, sd=25.09)

Upvotes: 32

Roland
Roland

Reputation: 132706

Like this?

mysamp <- function(n, m, s, lwr, upr, nnorm) {
  samp <- rnorm(nnorm, m, s)
  samp <- samp[samp >= lwr & samp <= upr]
  if (length(samp) >= n) {
    return(sample(samp, n))
  }  
  stop(simpleError("Not enough values to sample from. Try increasing nnorm."))
}

set.seed(42)
mysamp(n=10, m=39.74, s=25.09, lwr=0, upr=340, nnorm=1000)
#[1] 58.90437 38.72318 19.64453 20.24153 39.41130 12.80199 59.88558 30.88578 19.66092 32.46025

However, the result is not normal distributed and usually won't have the mean and sd you've specified (in particular if the limits are not symmetric around the specified mean).

Edit:

According to your comment it seems you want to translate this SAS function. I am not an SAS user, but this should do more or less the same:

mysamp <- function(n, m, s, lwr, upr, rounding) {
  samp <- round(rnorm(n, m, s), rounding)
  samp[samp < lwr] <- lwr
  samp[samp > upr] <- upr
  samp
}

set.seed(8)
mysamp(n=10, m=39.74, s=25.09, lwr=0, upr=340, rounding=3)
#[1] 37.618 60.826 28.111 25.920 58.207 37.033 35.467 12.434  0.000 24.857

You may then want to use replicate to run the simulations. Or if you want faster code:

sim <- matrix(mysamp(n=10*10, m=39.74, s=25.09, lwr=0, upr=340, rounding=3), 10)
means <- colMeans(sim)
sds <- apply(sim, 2, sd)

Upvotes: 10

alexis_laz
alexis_laz

Reputation: 13122

Assuming you want exactly 10 numbers and not the subset of them that is >0, <340 (and night not be a normal distribution):

    aa <- rnorm(n = 10, mean = 39.74, s = 25.09)

    while(any(aa<0 | aa>340)) { aa <- rnorm(n = 10, mean = 39.74, s = 25.09) }

Upvotes: 0

Related Questions