Emma

Reputation: 23

Finding multiple peak densities on facet wrapped ggplot for two datasets

I am trying to plot the density of flies against Julian date, per year. The aim is to see when fly densities peak for each of two data collection methods (group 1 and group 2). I have many rows of data spanning 10 years; for example, the data set looks like this:

year julian group
2000 214 1
2001 198 1
2001 224 1
2000 189 2
2000 214 2
2001 222 2
2001 259 2
2000 260 2
2000 212 1

Each row is a single observation. This is my first time plotting with ggplot2, so I am unsure how to plot vertical lines at the peaks for each year. The code currently looks like this:

Code

library(ggplot2)

data$group <- as.factor(data$group)

plots <- ggplot(data, aes(x = julian, group = group)) +
  geom_density(aes(colour = group), adjust = 2) +
  facet_wrap(~year, ncol = 2)

I have attempted to plot peaks using this code:

geom_vline(data = vline, aes(xintercept = density(data$julian)$x[which.max(density(data$julian)$y)]))

vline <- summarise(group_by(data,year, group=group), density(data$julian, group=group)$x[which.max(density(data$julian)$y)])

vline

However, I assume it has found the peak density across all years and all groups. Could anyone advise me on how to plot the maximum density for each year and group in each facet? Even better, if there are multiple peaks, how would I find those, along with a quantitative value for each peak?

Thank you in advance, I am very new to ggplot2.

Upvotes: 2

Views: 315

Answers (1)

stefan

Reputation: 124268

Instead of trying to wrangle all computations into one line of code, I would suggest splitting the work into steps, like so. Rather than using your code to find the highest peak, I use the approach from this answer, which in principle should also find multiple peaks (see below):


library(dplyr)
library(ggplot2)

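# Return the x positions of all local maxima of the kernel density estimate
fun_peak <- function(x, adjust = 2) {
  d <- density(x, adjust = adjust)
  # A peak is where the slope switches from non-negative to negative
  d$x[c(FALSE, diff(diff(d$y) >= 0) < 0)]
}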

vline <- data %>%
  group_by(year, group) %>%
  summarise(peak = fun_peak(julian))
#> `summarise()` has grouped output by 'year'. You can override using the `.groups` argument.

ggplot(data, aes(x = julian, group = group)) +
  geom_density(aes(colour = group), adjust = 2) +
  geom_vline(data = vline, aes(xintercept = peak)) +
  facet_wrap(~year, ncol = 2)
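
The question also asks for a quantitative value for each peak. A minimal sketch of one way to get it is below, assuming that the height of the density estimate at the peak is the value wanted. The helper `fun_peak_df` and the simulated `toy` data are hypothetical names introduced here for illustration, and `reframe()` assumes dplyr 1.1.0 or later, where it is the recommended verb when a group can return more than one row:

```r
library(dplyr)

# Hypothetical helper (not part of the original answer): returns the location
# and the estimated density height of every local maximum.
fun_peak_df <- function(x, adjust = 2) {
  d <- density(x, adjust = adjust)
  # TRUE where the slope switches from non-negative to negative;
  # padded with FALSE at both ends so it has the same length as d$x.
  is_peak <- c(FALSE, diff(diff(d$y) >= 0) < 0, FALSE)
  data.frame(peak = d$x[is_peak], height = d$y[is_peak])
}

# Simulated data for illustration only, shaped like the data in the question.
set.seed(1)
toy <- data.frame(
  year = rep(c(2000, 2001), each = 100),
  group = factor(rep(rep(1:2, each = 50), 2)),
  julian = round(rnorm(200, mean = 220, sd = 20))
)

# One row per detected peak, per year and group.
peaks <- toy %>%
  group_by(year, group) %>%
  reframe(fun_peak_df(julian))

peaks
```

The `peak` column can be passed to `geom_vline()` exactly as before, and `height` gives the estimated density at that position. Note that the heights depend on the chosen `adjust`, so it should match the value used in `geom_density()`.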

And here is a small example with multiple peaks based on the example data in the linked answer:

x <- c(1, 1, 4, 4, 9)

data <- data.frame(
  year = 2000,
  julian = rep(x, 2),
  group = rep(1:2, each = 5)
)
data$group <- as.factor(data$group)

vline <- data %>%
  group_by(year, group) %>%
  summarise(peak = fun_peak(julian, adjust = 1))
#> `summarise()` has grouped output by 'year', 'group'. You can override using the `.groups` argument.

ggplot(data, aes(x = julian, group = group)) +
  geom_density(aes(colour = group), adjust = 1) +
  geom_vline(data = vline, aes(xintercept = peak)) +
  facet_wrap(~year, ncol = 2)
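
How many peaks are detected depends on the bandwidth: a larger `adjust` smooths the estimate and typically merges neighbouring peaks. As a rough sketch of how one might check this for the example vector, the snippet below counts the detected peaks for a few values of `adjust`. The values tried are arbitrary choices for illustration, and `fun_peak` is repeated only so that the snippet runs on its own:

```r
# fun_peak as defined in the answer, repeated so this snippet is self-contained.
fun_peak <- function(x, adjust = 2) {
  d <- density(x, adjust = adjust)
  d$x[c(FALSE, diff(diff(d$y) >= 0) < 0)]
}

x <- c(1, 1, 4, 4, 9)

# Arbitrary bandwidth multipliers chosen for illustration.
adjust_values <- c(0.5, 1, 2, 4)

# Number of local maxima found at each bandwidth multiplier.
n_peaks <- sapply(adjust_values, function(a) length(fun_peak(x, adjust = a)))

data.frame(adjust = adjust_values, n_peaks = n_peaks)
```

Whichever value is chosen, using the same `adjust` in `fun_peak()` and in `geom_density()` keeps the vertical lines aligned with the plotted curves.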

Upvotes: 2
