Find the first value that meets a defined criterion

Question

I have a dataset on Covid-19 cases and deaths by day and country. I wish to find the date when the first death occurred for every country, and then filter out all the preceding days. How would you tackle this problem in R/Tidyverse?

library(readxl)
library(httr)
url <- paste("https://www.ecdc.europa.eu/sites/default/files/documents/COVID-19-geographic-disbtribution-worldwide-",format(Sys.time(), "%Y-%m-%d"), ".xlsx", sep = "")
GET(url, authenticate(":", ":", type="ntlm"), write_disk(tf <- tempfile(fileext = ".xlsx")))
df <- read_excel(tf)

nurandi · Accepted Answer

Try this:

library(dplyr)
# or library(tidyverse)

df %>%
  arrange(`Countries and territories`, DateRep) %>%
  group_by(`Countries and territories`) %>%
  # running total of deaths per country, in date order
  mutate(Cumulative_Death = cumsum(Deaths)) %>%
  ungroup() %>%
  # keep only the days on or after each country's first death
  filter(Cumulative_Death > 0) %>%
  group_by(`Countries and territories`) %>%
  mutate(First_Death_Date = min(DateRep)) %>%
  ungroup()

This adds two new columns: Cumulative_Death, the running total of deaths up to DateRep, and First_Death_Date, the date on which the first death occurred in each country. The filter step drops every row before a country's first death.
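To see the idea without downloading the spreadsheet, here is a minimal sketch on made-up data. The `toy` data frame and its values are hypothetical; only the column names are taken from the question. It is also a slight variant of the code above: it filters on `cumsum(Deaths)` directly inside the group instead of storing it as a column first, and it assumes `Deaths` contains no missing values.

```r
library(dplyr)

# Hypothetical toy data standing in for the ECDC sheet (values are made up)
toy <- tibble::tibble(
  `Countries and territories` = rep(c("A", "B"), each = 4),
  DateRep = rep(as.Date("2020-03-01") + 0:3, times = 2),
  Deaths  = c(0, 0, 1, 2,   0, 3, 0, 1)
)

first_deaths <- toy %>%
  arrange(`Countries and territories`, DateRep) %>%
  group_by(`Countries and territories`) %>%
  # cumsum() is evaluated per country, so this drops the days before the first death
  filter(cumsum(Deaths) > 0) %>%
  # earliest remaining date per country is the date of the first death
  mutate(First_Death_Date = min(DateRep)) %>%
  ungroup()

print(first_deaths)
```

Note that a day with zero deaths after the first death (country B's third day here) is kept, because the running total is already above zero.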
