Reputation: 1740
I am interested into drawing random samples from a uniform distribution which is spread across 2 intervals: from -2 to -1 and from 1 to 2. Any value which belongs to these intervals should have equal probability of being sampled and the other values (e.g. from -1 to 1) should have 0 probability from being sampled.
The runif
function only accepts single numbers as arguments for the upper and lower limit. Thus, it is impossible to make that function sample from two intervals.
Another option would be running two runif
functions, one on the interval from -2 to -1, and one on the interval from 1 to 2. However, in this case I explicitly make it so that in each draw there will be an equal number of numbers drawn from each interval.
I want to allow this to vary and, while on the average there will be roughly the same amount of positive and negative numbers, I still need to see some random variation, e.g. draws where more numbers were drawn from the positive interval or vice versa.
Upvotes: 4
Views: 762
Reputation: 39667
Maybe add the gap afterwards like:
n <- 10
x <- runif(n, -2, 0)
i <- x > -1
x[i] <- x[i] + 2
As a Function:
runifGap <- function(n, xMin, xMax, gapMin, gapMax) {
dGap <- gapMax- gapMin
x <- runif(n, xMin, xMax - dGap)
i <- x>gapMin
x[i] <- x[i] + dGap
x
}
runifGap(10, -2, 2, -1, 1)
Or in case of more than one gap:
runifGaps <- function(n, r) {
d <- diff(r)
dj <- d[c(FALSE, TRUE)]
x <- runif(n, r[1], r[1] + sum(d[c(TRUE, FALSE)]))
cd <- r[c(FALSE, TRUE)]
for(i in seq_along(dj)) {
j <- which(x > cd[i])
x[j] <- x[j] + dj[i]
}
x
}
runifGaps(10, c(-2, -1, 1, 2))
runifGaps(10, c(-2, -1, 1, 2, 4, 6)) #for ranges -2:-1, 1:2 and 4:6
Or using findInterval
instead of a for loop.
runifGapsB <- function(n, r) {
d <- diff(r)
x <- runif(n, r[1], r[1] + sum(d[c(TRUE, FALSE)]))
x + c(0, cumsum(d[c(FALSE, TRUE)]))[1 +
findInterval(x, r[1] + cumsum(head(d[c(TRUE, FALSE)], -1)))]
}
Benchmark
n <- 1e7
bench::mark(check = FALSE,
shs = runif_multiinterval(n = n, intervals = tribble(~min, ~max, -2, -1, 1, 2)),
"M.Viking" = ifelse(runif(n)>0.5, runif(n,-2,-1), runif(n,1,2)),
"Stéphane Laurent" = (2L * sample.int(2L, size=n, replace = TRUE) - 3L) * runif(n, 1, 2),
LaurentGKi = sample(c(-1, 1), n, TRUE) * runif(n, 1, 2),
LaurentGKi2 = c(-1L, 1L)[sample.int(2L, n, TRUE)] * runif(n, 1, 2),
GKi = {x <- runif(n, -2, 0); i <- x>-1; `[<-`(x, i, x[i] + 2)},
GKi2 = {x <- runif(n, -2, 0); i <- which(x>-1); `[<-`(x, i, x[i] + 2)},
GKi3 = runifGaps(n, c(-2, -1, 1, 2))
)
# expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc
# <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl> <int> <dbl>
#1 shs 1.12s 1.12s 0.891 305MB 1.78 1 2
#2 M.Viking 519.98ms 519.98ms 1.92 763MB 5.77 1 3
#3 Stéphane Laurent 274.15ms 276.7ms 3.61 114MB 1.81 2 1
#4 LaurentGKi 270.67ms 270.76ms 3.69 191MB 1.85 2 1
#5 LaurentGKi2 263.47ms 266.96ms 3.75 153MB 1.87 2 1
#6 GKi 214.68ms 218.12ms 4.25 343MB 7.08 3 5
#7 GKi2 191.72ms 200.83ms 5.05 286MB 6.73 3 4
#8 GKi3 168.85ms 170.49ms 5.36 210MB 5.36 3 3
In this example the methods of GKi are the fastest and the method of Stéphane Laurent uses the lowest amount of memory.
Upvotes: 4
Reputation: 3901
I wrote a small function to generalize uniform sampling across multiple intervals. It uses 2-stage sampling. First, the number of units per interval is drawn from the multinomial distribution, weighted by the respective interval width. Second, runif
is applied to those draws with the specified interval parameters.
library(dplyr)
library(purrr)
#' Draw uniformly across multiple intervals.
#' @param n number of output elements
#' @param intervals data frame of intervals. Must have 2 columns: min, max
runif_multiinterval <- \(n, intervals) {
intervals |>
mutate(n = rmultinom(1, n, prob = max - min)) |> # 1st stage: draw group sizes
pmap(runif) |> # 2nd stage: draw within intervals
unlist() |>
sample() # randomize order
}
Works for your case:
runif_multiinterval(
n = 1e5,
intervals = tribble(
~min, ~max,
-2, -1,
1, 2
)) |>
hist()
Works for multiple, differently sized intervals:
runif_multiinterval(
n = 1e5,
intervals = tribble(
~min, ~max,
-2, -1,
1, 2,
4, 6
)) |>
hist()
Created on 2023-02-13 with reprex v2.0.2
Upvotes: 3
Reputation: 84529
Since this is symmetric around 0 you can simulate a random sign and assign it to a uniform value in [1,2]:
nsims <- 1e6L
random_sign <- 2L * sample.int(2L, size=nsims, replace = TRUE) - 3L
sims <- random_sign * runif(nsims, 1, 2)
hist(sims)
Upvotes: 4
Reputation: 5398
Edit: As my first answer led to exactly equal amounts of positive and negative values, here is another way that executes runif
3 times:
ifelse(runif(1000)>0.5, runif(1000,-2,-1), runif(1000,1,2))
Appears to be undocumented, but experimentation led me to:
runif(1000, min=c(-2, 1), max=c(-1, 2))
example:
hist(runif(1000000, min=c(-2, 1), max=c(-1, 2)))
Upvotes: 4