Reputation: 390

Extracting peak values from a density ggplot

Here is a minimal example of my problem.

I would like a vertical line passing through the two peaks in the graph and a label of its corresponding x axis.

    library(dplyr)
    library(ggplot2)

   data.frame(Values=c(1,1,1,2,2,3,3,5,4,5,5,5,5,6,6,6,6,5,5,5,
                        4,4,4,7,8,8,8,8,8,8,8,4,3,2,2,3,5,6,7,4,
                        2,2,3,9,9,9,9,9,9,9,9,9,9,5,1,3,3,1))%>%
    ggplot(aes(x=Values))+
      geom_density()+
      theme_classic()

Upvotes: 1

Answers (3)

teunbrand

Reputation: 37953

Adapting the code from an answer I gave to a similar question should also work here. We first extract the layer data from the plot itself to make sure we're not accedentily assuming different bandwidths or kernels than ggplot2. Like the answer Ben Bolker linked, this also makes use of run-length encoding of the differences.

library(ggplot2)

df <- data.frame(Values=c(1,1,1,2,2,3,3,5,4,5,5,5,5,6,6,6,6,5,5,5,
                    4,4,4,7,8,8,8,8,8,8,8,4,3,2,2,3,5,6,7,4,
                    2,2,3,9,9,9,9,9,9,9,9,9,9,5,1,3,3,1))
g <- ggplot(df, aes(x=Values))+
  geom_density()+
  theme_classic()

dens <- layer_data(g, 1)
dens <- dens[order(dens$x),]

# Run length encode the sign of difference
rle <- rle(diff(as.vector(dens$y)) > 0)
# Calculate startpoints of runs
starts <- cumsum(rle$lengths) - rle$lengths + 1
# Take the points where the rle is FALSE (so difference goes from positive to negative) 
maxima_id <- starts[!rle$values]

maxima <- dens[maxima_id,]

g + geom_vline(data = maxima,
               aes(xintercept = x))

^{Created on 2021-01-25 by the reprex package (v0.3.0)}

Upvotes: 4

giocomai

Reputation: 3518

For a slight variation, this gets the data out of the ggplot2 object, and then first finds the highest value overall, then the highest value for x above 7.

There's definitely more elegant ways to achieve the same.

The main difference here is that you are getting the data out of the ggplot2 object itself, and then you can process it as you like.

library(dplyr)
library(ggplot2)

plot <- data.frame(Values=c(1,1,1,2,2,3,3,5,4,5,5,5,5,6,6,6,6,5,5,5,
                    4,4,4,7,8,8,8,8,8,8,8,4,3,2,2,3,5,6,7,4,
                    2,2,3,9,9,9,9,9,9,9,9,9,9,5,1,3,3,1)) %>% 
  ggplot(aes(x=Values))+
  geom_density()+
  theme_classic() 
  
p <- ggplot_build(plot)


plot +
  geom_vline(xintercept = p$data[[1]]$x[which.max(p$data[[1]]$y)]) +
  geom_vline(xintercept = p$data[[1]]$x[p$data[[1]]$x>7][which.max(p$data[[1]]$y[p$data[[1]]$x>7])])

^{Created on 2021-01-25 by the reprex package (v0.3.0)}

Upvotes: 0

IceCreamToucan

Reputation: 28685

Building on Ben Bolker's comment:

This code doesn't strictly tell you all the mathematical local maxima, it just tells you points which are the max within a width-20 window whose endpoints don't equal the max.

values <- c(1,1,1,2,2,3,3,5,4,5,5,5,5,6,6,6,6,5,5,5,
                        4,4,4,7,8,8,8,8,8,8,8,4,3,2,2,3,5,6,7,4,
                        2,2,3,9,9,9,9,9,9,9,9,9,9,5,1,3,3,1)

dens <- density(values)

i1 <- 
  zoo::rollapply(dens$y, 20, function(x){
    if(head(x, 1) < max(x) & tail(x, 1) < max(x))
      which.max(x)
    else 
      NA
  })

max_inds <- unique(i1 + seq_along(i1) - 1)
x_at_max <- dens$x[max_inds[!is.na(max_inds)]]


data.frame(Values=c(1,1,1,2,2,3,3,5,4,5,5,5,5,6,6,6,6,5,5,5,
                        4,4,4,7,8,8,8,8,8,8,8,4,3,2,2,3,5,6,7,4,
                        2,2,3,9,9,9,9,9,9,9,9,9,9,5,1,3,3,1))%>%
    ggplot(aes(x=Values))+
      geom_density()+
      theme_classic() +
    geom_vline(xintercept = x_at_max)

Edit:

Actually, looks like this gives the same result as the approach in the answer linked by Ben Bolker. You should probably use that one. I assume it's faster, and also it will give you all the local maximua even if they're not the max of a width-20 window.

r <- rle(dens$y)
which(rep(x = diff(sign(diff(c(-Inf, r$values, -Inf)))) == -2, 
          times = r$lengths))
# [1] 240 378

max_inds[!is.na(max_inds)]
# [1] 240 378

Upvotes: 3

Extracting peak values from a density ggplot

Answers (3)

Related Questions