Economist_Ayahuasca

Reputation: 1642

R: subsetting a dataframe according to a condition by id

I have the following data set:

Lines <- "id Observation_code Observation_value
1  A       5
1  A       6
1  B       24
2  C       2
2  D       9
2  A       12
3  V       5
3  E       6
3  C       24
4  B       2
4  D       9
4  C       12"

dat <- read.table(text = Lines, header = TRUE)

I would like to subset the data so that I keep the whole history of every patient who has at least one row with Observation_code == "A". In this example, only ids 1 and 2 have Observation_code A, so they are the only ones that should remain. Note that all observations for ids 1 and 2 should be in the final dataset:

Final <- "id Observation_code Observation_value
1  A       5
1  A       6
1  B       24
2  C       2
2  D       9
2  A       12"

dat_Final <- read.table(text = Final, header = TRUE)

Upvotes: 1

Views: 75

Answers (1)

r2evans

Reputation: 161110

base R

# ind is TRUE for every row whose id has at least one Observation_code == "A"
ind <- ave(dat$Observation_code == "A", dat$id, FUN = any)
dat[ind,]
#   id Observation_code Observation_value
# 1  1                A                 5
# 2  1                A                 6
# 3  1                B                24
# 4  2                C                 2
# 5  2                D                 9
# 6  2                A                12

or

# split by id, keep a group's rows only if it contains an "A", then re-bind
do.call(rbind, by(dat, dat$id, FUN = function(z) z[any(z$Observation_code == "A"),]))
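A third base R option, not part of the original answer and offered only as a minimal sketch: first collect the ids that have an "A" row, then keep every row whose id is in that set. The names ids_with_A and dat_A are made up for illustration, and the data is rebuilt from the question so the snippet runs on its own.

```r
# Rebuild the example data from the question.
Lines <- "id Observation_code Observation_value
1  A       5
1  A       6
1  B       24
2  C       2
2  D       9
2  A       12
3  V       5
3  E       6
3  C       24
4  B       2
4  D       9
4  C       12"
dat <- read.table(text = Lines, header = TRUE)

# ids that have at least one row with Observation_code "A"
# (ids_with_A is a hypothetical helper name)
ids_with_A <- unique(dat$id[dat$Observation_code == "A"])

# keep every row that belongs to one of those ids
dat_A <- dat[dat$id %in% ids_with_A, ]
dat_A
```

Since %in% always returns TRUE or FALSE, never NA, the row filter itself cannot introduce NA rows. That may matter if the real data has missing values.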

dplyr

library(dplyr)
dat %>%
  group_by(id) %>%
  filter(any(Observation_code == "A")) %>%
  ungroup()
# # A tibble: 6 x 3
#      id Observation_code Observation_value
#   <int> <chr>                        <int>
# 1     1 A                                5
# 2     1 A                                6
# 3     1 B                               24
# 4     2 C                                2
# 5     2 D                                9
# 6     2 A                               12

Upvotes: 1
