Reputation: 23
I was wondering if there was any way to plot this PDF and CDF in R. I found these on a different question a user asked, and was curious.
I know that I have to create a function and then plot this, but I'm struggling with the different parameters and am unsure how to translate this to R. I have only ever plotted PDF/CDF using a normal distribution, or from datasets.
Upvotes: 2
Views: 658
Reputation: 101383
We can try the code below
f <- function(x) 0.5 * (x >= 0) + 0.5 * (x >= 1) - (x >= 1.5)
F <- Vectorize(function(x) integrate(f, -Inf, x)$value)
plot(f, -2, 2, col = "red")
curve(F, -2, 2, add = TRUE, col = "blue")
legend(-1, 0.8,
legend = c("pdf", "cdf"),
col = c("red", "blue"),
lty = 1:2, cex = 2,
box.col = "white"
)
Upvotes: 0
Reputation: 72828
ifelse
can be very slow, we can fill an output vector instead. numeric()
creates a vector of zeroes of a specified length, we then simply change everything that should not yield zero.
pdf_vec <- function(x) {
out <- numeric(length(x))
out[x >= 0 & x < 1] <- .5
out[x >= 1 & x < 1.5] <- 1
out
}
cdf_vec <- function(x) {
out <- numeric(length(x))
tmp <- x >= 0 & x < 1
out[tmp] <- .5*x[tmp]
tmp <- x >= 1 & x < 1.5
out[tmp] <- x[tmp] - .5
tmp <- x >= 1.5
out[tmp] <- 1
out
}
set.seed(42)
x <- rnorm(1e6)
stopifnot(all.equal(cdf(x), cdf1(x)))
stopifnot(all.equal(pdf(x), pdf1(x)))
@Allan Camero already showed nicely how to plot it.
It's about three times faster than the ifelse
solution.
microbenchmark::microbenchmark(
cdf_ifelse=cdf(x), cdf_vec=cdf1(x), check='equal'
)
# Unit: milliseconds
# expr min lq mean median uq max neval cld
# cdf_ifelse 110.66285 121.25428 133.0789 133.86041 142.53296 167.64401 100 b
# cdf_vec 43.56277 45.08759 48.8924 46.83869 49.46047 74.95487 100 a
microbenchmark::microbenchmark(
pdf_ifelse=pdf(x), pdf_vec=pdf1(x), check='equal'
)
# Unit: milliseconds
# expr min lq mean median uq max neval cld
# pdf_ifelse 71.39501 76.71747 101.17738 81.43037 87.82162 192.3919 100 b
# pdf_vec 27.82642 30.02056 34.55301 30.38457 31.29751 133.2798 100 a
Upvotes: 2
Reputation: 173813
You can write the pdf and cdf as functions, using ifelse
to specify the values within the ranges, then simply plot the functions over the desired range:
pdf <- function(x) {
ifelse(x >= 0 & x < 1, 0.5, ifelse(x < 1.5 & x >= 1, 1, 0))
}
cdf <- function(x) {
ifelse(x < 0, 0,
ifelse(x >= 0 & x < 1, 0.5 * x,
ifelse(x < 1.5 & x >= 1, x - 0.5, 1)))
}
plot(pdf, xlim = c(-1, 2), type = "s")
plot(cdf, xlim = c(-1, 2))
Created on 2022-10-27 with reprex v2.0.2
Upvotes: 3