Reputation: 4201
I thought this should be straightforward, but I'm lost, despite tons of information online.
My Problem: I have a vector of data points, for which I want to plot a density curve, then color the area under the curve to signify the Highest Density Interval (HDI). Naturally, I'm trying to achieve this with ggplot2
package, and specifically with qplot()
, since my data comes as a vector, and not a data frame.
library(ggplot2)
library(HDInterval)
## create data vector
set.seed(789)
dat <- rnorm(1000)
## plot density curve with qplot and mark 95% hdi
qplot(dat, geom = "density")+
geom_vline(aes(xintercept = c(hdi(dat))))
So I get this:
But what I really want is something like this:
Is there a simple way to achieve this with ggplot2::qplot
?
Upvotes: 3
Views: 1367
Reputation: 55
When I read this post I was realy grateful for your answer, Wilke. But I wondered how to adjust the credible mass of the hdi. Finally I found a solution! When I figuered out where the quantiles argument comes from (I access it via dots[[2]]) it clicked the trigger. I wrote the following function (because passing the value to HDInterval::hdi didn't work out of the box):
hdi_custWidth <- function(...) {
dots <- list(...)
quantiles <- dots[[2]]
hdi_width <- quantiles[[length(quantiles)]] # uses the last entry if its a vector which should be the biggest one; better pass a single double < 1.0
if (is.na(hdi_width)) hdi_width <- .89 # happens is quantiles = 1L
message(paste0('HDI credible interval width = ', hdi_width))
HDInterval::hdi(dots[[1]], credMass = hdi_width)
}
You can use it to alter the repex from the post above:
library(tidyverse)
library(HDInterval)
library(ggridges)
#>
#> Attaching package: 'ggridges'
#> The following object is masked from 'package:ggplot2':
#>
#> scale_discrete_manual
## create data vector
set.seed(789)
dat <- rnorm(1000)
df <- tibble(dat)
## plot density curve with qplot and mark 95% hdi
ggplot(df, aes(x = dat, y = 0, fill = stat(quantile))) +
geom_density_ridges_gradient(quantile_lines = TRUE, quantile_fun = hdi_custWidth, quantiles = .90, vline_linetype = 2) +
scale_fill_manual(values = c("transparent", "lightblue", "transparent"), guide = "none")
#> Picking joint bandwidth of 0.227
Of course you can pick any value between 0 and 1 in the quantiles argument (not just .90) and get the corresponding hdi credible mass.
Upvotes: 1
Reputation: 17790
You can do this with the ggridges package. The trick is that we can provide HDInterval::hdi
as quantile function to geom_density_ridges_gradient()
, and that we can fill by the "quantiles" it generates. The "quantiles" are the numbers in the lower tail, in the middle, and in the upper tail.
As a general point of advice, I would recommend against using qplot()
. It's more likely going to cause confusion, and putting a vector into a tibble is not a lot of effort.
library(tidyverse)
library(HDInterval)
library(ggridges)
#>
#> Attaching package: 'ggridges'
#> The following object is masked from 'package:ggplot2':
#>
#> scale_discrete_manual
## create data vector
set.seed(789)
dat <- rnorm(1000)
df <- tibble(dat)
## plot density curve with qplot and mark 95% hdi
ggplot(df, aes(x = dat, y = 0, fill = stat(quantile))) +
geom_density_ridges_gradient(quantile_lines = TRUE, quantile_fun = hdi, vline_linetype = 2) +
scale_fill_manual(values = c("transparent", "lightblue", "transparent"), guide = "none")
#> Picking joint bandwidth of 0.227
Created on 2019-12-24 by the reprex package (v0.3.0)
The colors in scale_fill_manual()
are in the order of the three groups, so if you, for example, only wanted to shade the left tail, you would write values = c("lightblue", "transparent", "transparent")
.
Upvotes: 5