Reputation: 75
I have a geom_smooth plot with date on the x-axis, COVID cases on the y-axis, and two categories. I'm trying to mark the maximum peak of each smoothed curve.
# Reproducible data
library(tidyverse)
df <- tribble(~date, ~cases, ~category,
"2021/1/1", 100, "A",
"2021/1/1", 103, "B",
"2021/1/2", 108, "A",
"2021/1/2", 109, "B",
"2021/1/3", 102, "A",
"2021/1/3", 120, "B",
"2021/1/4", 150, "A",
"2021/1/4", 160, "B",
"2021/1/5", 120, "A",
"2021/1/5", 110, "B",
"2021/1/6", 115, "A",
"2021/1/6", 105, "B",)
# Plotting geom_smooth
df %>%
  ggplot(aes(date, cases, group = category, color = category)) +
  geom_smooth()
How do I add the maximum peak to the geom_smooth? Ideally, I want both a point and a text label that shows the peak case count.
I tried finding the peaks outside of the ggplot code, but that returns a different peak, because geom_smooth fits its own function rather than simply taking the mean of each category.
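To illustrate the discrepancy described above, here is a minimal sketch on made-up data (not the data in this question). It assumes the dates are stored as Date values and that the smooth is a loess fit, and it reads the fitted curve back from the plot with ggplot2's layer_data(). The names raw_peaks and smooth_peaks are illustrative only.

```r
library(tidyverse)

# Made-up data: one bump per category plus a small alternating wiggle
days <- 1:60
df <- tibble(
  category = rep(c("A", "B"), each = 60),
  date     = rep(as.Date("2021-01-01") + days - 1, times = 2),
  cases    = 1000 * c(dnorm(days, 25, 8), dnorm(days, 35, 8)) +
    rep(c(5, -5), times = 60)
)

p <- ggplot(df, aes(date, cases, color = category)) +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE)

# Peak of the raw observations, per category
raw_peaks <- df %>%
  group_by(category) %>%
  slice_max(cases, n = 1, with_ties = FALSE) %>%
  ungroup()

# Peak of the curve that geom_smooth actually draws, per group;
# layer_data() returns x as a number of days, so convert it back to a Date
smooth_peaks <- layer_data(p, 1) %>%
  group_by(group) %>%
  slice_max(y, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  mutate(date = as.Date(x, origin = "1970-01-01"))

raw_peaks
smooth_peaks
```

In this sketch the two tables disagree, which is the mismatch I ran into: the raw maximum belongs to a single observation, while the smoothed maximum belongs to the fitted curve.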
The answer below worked, but I want to move the labels to make them more legible. However, geom_text_repel seems to refer only to the first curve rather than both. Any advice?
library(ggplot2)
library(tidyverse)
library(ggrepel)
# Fake data
ar <- hist(rnorm(10000, 1), breaks = 180, plot = FALSE)$counts
br <- hist(rnorm(11000, 1), breaks = 180, plot = FALSE)$counts
df <- rbind(
  tibble(category = "B", date = seq(as.Date("2021-01-01"), by = 1, length.out = length(br)), value = br),
  tibble(category = "A", date = seq(as.Date("2021-01-01"), by = 1, length.out = length(ar)), value = ar)
)
# Create the smooth and retain the rows with the max of the smooth, using slice_max
sm_max <- df %>%
  group_by(category) %>%
  mutate(smooth = predict(loess(value ~ as.numeric(date), span = .5))) %>%
  slice_max(order_by = smooth)
# Plot, using the same smooth as above (default method is loess, span set as above)
df %>%
  ggplot(aes(date, value, group = category, color = category)) +
  geom_point() +
  geom_smooth(span = .5, se = FALSE) +
  geom_point(data = sm_max, aes(y = smooth), color = "black", size = 5) +
  geom_text_repel(data = sm_max, aes(label = paste0("Peak: ", round(smooth, 1))), color = "black")
geom_text_repel(data = sm_max_p3, aes(x = date,
y = smooth,
label = paste0(candidate, " Peak: ",round(smooth,1))
Upvotes: 3
Views: 2124
Reputation: 24877
You need to generate the smooth first and identify its maximum. You can then either plot the precomputed smooth and its maximum directly, or add the maximum to a plot that uses a geom_smooth() call, making sure that geom_smooth uses the same smooth that you used when generating and identifying the max. Here is an example that uses the latter of these two options:
# Fake data
ar <- hist(rnorm(10000, 1), breaks = 180, plot = FALSE)$counts
br <- hist(rnorm(25000, 1), breaks = 180, plot = FALSE)$counts
df <- rbind(
  tibble(category = "B", date = seq(as.Date("2021-01-01"), by = 1, length.out = length(br)), value = br),
  tibble(category = "A", date = seq(as.Date("2021-01-01"), by = 1, length.out = length(ar)), value = ar)
)
# Create the smooth and retain the rows with the max of the smooth, using slice_max
sm_max <- df %>%
  group_by(category) %>%
  mutate(smooth = predict(loess(value ~ as.numeric(date), span = .5))) %>%
  slice_max(order_by = smooth)
# Plot, using the same smooth as above (default method is loess, span set as above)
df %>%
  ggplot(aes(date, value, group = category, color = category)) +
  geom_point() +
  geom_smooth(span = .5, se = FALSE) +
  geom_point(data = sm_max, aes(y = smooth), color = "black", size = 5) +
  geom_text(data = sm_max, aes(y = smooth, label = paste0("Peak: ", round(smooth, 1))), color = "black")
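As an alternative to relying on geom_smooth(), the precomputed smooth can be drawn directly with geom_line(), so the curve and the marked peak are guaranteed to come from the same fit. The following is only a sketch on made-up data. The names smoothed and peaks are illustrative, and span = 0.5 is carried over from the example above rather than being a recommended value.

```r
library(tidyverse)

# Made-up data: one bump per category plus a small alternating wiggle
days <- 1:60
df <- tibble(
  category = rep(c("A", "B"), each = 60),
  date     = rep(as.Date("2021-01-01") + days - 1, times = 2),
  value    = 1000 * c(dnorm(days, 25, 8), dnorm(days, 35, 8)) +
    rep(c(5, -5), times = 60)
)

# Fit one loess curve per category and keep the fitted values
smoothed <- df %>%
  group_by(category) %>%
  mutate(smooth = predict(loess(value ~ as.numeric(date), span = 0.5))) %>%
  ungroup()

# One row per category: the highest point of the fitted curve
peaks <- smoothed %>%
  group_by(category) %>%
  slice_max(order_by = smooth, n = 1, with_ties = FALSE) %>%
  ungroup()

# Draw the fitted values themselves instead of calling geom_smooth()
p <- ggplot(smoothed, aes(date, color = category)) +
  geom_point(aes(y = value), alpha = 0.4) +
  geom_line(aes(y = smooth)) +
  geom_point(data = peaks, aes(y = smooth), color = "black", size = 3) +
  geom_text(data = peaks,
            aes(y = smooth, label = paste0("Peak: ", round(smooth, 1))),
            color = "black", vjust = -1)
p
```

Because the line and the peak share the smooth column, there is no second fit that could drift out of sync if the span or method is changed later.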
Upvotes: 1
Reputation: 5254
If you're just looking to label the maximum measured value, you can use {gghighlight} to show and label only that point on top of the smoothed curve. Also, your date is a character, so it's a discrete variable. Therefore your geom_smooth() is just a point-to-point line. Here, I convert it to a continuous variable with mutate(date = lubridate::ymd(date)).
library(tidyverse)
library(lubridate)
library(gghighlight)
df <- tribble(~date, ~cases, ~category,
"2021/1/1", 100, "A",
"2021/1/1", 103, "B",
"2021/1/2", 108, "A",
"2021/1/2", 109, "B",
"2021/1/3", 102, "A",
"2021/1/3", 120, "B",
"2021/1/4", 150, "A",
"2021/1/4", 160, "B",
"2021/1/5", 120, "A",
"2021/1/5", 110, "B",
"2021/1/6", 115, "A",
"2021/1/6", 105, "B",)
# Plotting geom_smooth
df %>%
  mutate(date = ymd(date)) %>%
  group_by(category) %>%
  mutate(is_max = cases == max(cases)) %>%
  ggplot(aes(date, cases, color = category)) +
  geom_smooth() +
  geom_point(size = 3) +
  gghighlight(is_max,
              n = 1,
              unhighlighted_params = list(alpha = 0),
              label_key = cases)
Created on 2022-02-17 by the reprex package (v2.0.1)
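A quick way to check the date conversion on its own, as a small sketch. The vector names raw_dates and parsed are illustrative only.

```r
library(lubridate)

raw_dates <- c("2021/1/1", "2021/1/2", "2021/1/3")
parsed <- ymd(raw_dates)

class(raw_dates)  # "character": ggplot2 treats this as a discrete axis
class(parsed)     # "Date": ggplot2 treats this as a continuous axis
diff(parsed)      # spacing between consecutive dates, in days
```

Once the column is a Date, the x-axis has real distances between observations, which a smoother needs in order to fit a curve.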
Upvotes: 1