Xin
Xin

Reputation: 674

Extract values that contains specified values in different column

I have this dataset:

case_number <- c("1", "1", "2", "2", "2", "3", "3")
type <- c("STD", "STD2", "STD", "STD2", "STD", "STD", "STD2")
date <- as.Date(c('2008-11-1','2009-3-25','2010-3-14','2010-10-14',
                  '2010-11-14', '2015-3-14', '2015-4-15'))

data <- data.frame(case_number,type, date)

I was wondering how would I extract the unique case_number that contains "STD2" as the last type sorted by date?

In this case it would be 1 and 3 since they have STD2 as their last type sorted by date.

Upvotes: 0

Views: 43

Answers (2)

M--
M--

Reputation: 28826

We can use data.table:

library(data.table)

setDT(data)[
  setDT(data)[,.I[date == max(date)], by = case_number]$V1][
    type %in% "STD2"]

#>    case_number type       date
#> 1:           1 STD2 2009-03-25
#> 2:           3 STD2 2015-04-15

Upvotes: 2

IceCreamToucan
IceCreamToucan

Reputation: 28675

If you just want the numbers, you can use tapply (assuming your data is sorted by date already within each case_number)

case_nums <- with(data, tapply(type == 'STD2', case_number, tail, 1))

names(case_nums)[case_nums]
# [1] "1" "3"

If you want to subset to those case_numbers, you can use dplyr

library(dplyr)

data %>% 
  group_by(case_number) %>% 
  filter(type[which.max(date)] == 'STD2')
# # A tibble: 4 x 3
# # Groups:   case_number [2]
#   case_number type  date      
#   <fct>       <fct> <date>    
# 1 1           STD   2008-11-01
# 2 1           STD2  2009-03-25
# 3 3           STD   2015-03-14
# 4 3           STD2  2015-04-15

Upvotes: 2

Related Questions