Reputation: 83

Identify observations conditionally

My dataset it as follows:

ID     Type
55523  Ref
55523  Ref
19999  Ref
19999  View
55523  Ref
33333  View
33333  View
33333  Ref
11111  Ref

I would like a new dataframe that identifies ID's that have only Ref Type, and none other. I would like to retain the duplicates. The result would be as follows:

ID     Type
55523  Ref
55523  Ref
55523  Ref
11111  Ref

Thank you in advance.

Upvotes: 0

Answers (3)

Ronak Shah

Reputation: 389335

Select ID's where all the Type values is 'Ref'.

This can be done in base R :

subset(df, ave(Type == 'Ref', ID, FUN = all))

#     ID Type
#1 55523  Ref
#2 55523  Ref
#5 55523  Ref
#9 11111  Ref

dplyr :

library(dplyr)
df %>% group_by(ID) %>% filter(all(Type == 'Ref'))

and data.table :

library(data.table)
setDT(df)[, .SD[all(Type == 'Ref')], ID]

Upvotes: 1

Colombo

Reputation: 607

df = data.frame(
    "ID" = c(55523, 55523, 19999, 19999, 55523, 33333, 33333, 33333, 11111),
    "Type" = c("Ref", "Ref", "Ref", "View", "Ref", "View", "View", "Ref", "Ref")
    )

ref = df[df$Type == "Ref",]
other = df[df$Type != "Ref",]

ref[!ref$ID %in% other$ID,]
#      ID Type
# 1 55523  Ref
# 2 55523  Ref
# 5 55523  Ref
# 9 11111  Ref

Upvotes: 0

nyk

Reputation: 680

library(dplyr)
library(tibble)

tb <- tribble(~ID, ~Type,
55523,  "Ref",
55523,  "Ref",
19999,  "Ref",
19999,  "View",
55523,  "Ref",
33333,  "View",
33333,  "View",
33333,  "Ref",
11111,  "Ref")

a <- tb %>% count(ID, Type) %>% group_by(ID) %>% count(ID) %>% filter(n == 1)

b <- tb %>% semi_join(a, by = c("ID" = "ID"))
b 
# A tibble: 4 x 2
     ID Type 
  <dbl> <chr>
1 55523 Ref  
2 55523 Ref  
3 55523 Ref  
4 11111 Ref

Upvotes: 0

Identify observations conditionally

Answers (3)

Related Questions