Reputation: 674
I have this dataset:
case_number <- c("1", "1", "2", "2", "2", "3", "3")
type <- c("STD", "STD2", "STD", "STD2", "STD", "STD", "STD2")
date <- as.Date(c('2008-11-1','2009-3-25','2010-3-14','2010-10-14',
'2010-11-14', '2015-3-14', '2015-4-15'))
data <- data.frame(case_number,type, date)
I was wondering how would I extract the unique case_number
that contains "STD2"
as the last type sorted by date?
In this case it would be 1 and 3 since they have STD2 as their last type sorted by date.
Upvotes: 0
Views: 43
Reputation: 28826
We can use data.table
:
library(data.table)
setDT(data)[
setDT(data)[,.I[date == max(date)], by = case_number]$V1][
type %in% "STD2"]
#> case_number type date
#> 1: 1 STD2 2009-03-25
#> 2: 3 STD2 2015-04-15
Upvotes: 2
Reputation: 28675
If you just want the numbers, you can use tapply (assuming your data is sorted by date already within each case_number)
case_nums <- with(data, tapply(type == 'STD2', case_number, tail, 1))
names(case_nums)[case_nums]
# [1] "1" "3"
If you want to subset to those case_numbers, you can use dplyr
library(dplyr)
data %>%
group_by(case_number) %>%
filter(type[which.max(date)] == 'STD2')
# # A tibble: 4 x 3
# # Groups: case_number [2]
# case_number type date
# <fct> <fct> <date>
# 1 1 STD 2008-11-01
# 2 1 STD2 2009-03-25
# 3 3 STD 2015-03-14
# 4 3 STD2 2015-04-15
Upvotes: 2