Reputation: 83
My dataset it as follows:
ID Type
55523 Ref
55523 Ref
19999 Ref
19999 View
55523 Ref
33333 View
33333 View
33333 Ref
11111 Ref
I would like a new dataframe that identifies ID's that have only Ref Type, and none other. I would like to retain the duplicates. The result would be as follows:
ID Type
55523 Ref
55523 Ref
55523 Ref
11111 Ref
Thank you in advance.
Upvotes: 0
Views: 84
Reputation: 389335
Select ID's where all
the Type
values is 'Ref'
.
This can be done in base R :
subset(df, ave(Type == 'Ref', ID, FUN = all))
# ID Type
#1 55523 Ref
#2 55523 Ref
#5 55523 Ref
#9 11111 Ref
dplyr
:
library(dplyr)
df %>% group_by(ID) %>% filter(all(Type == 'Ref'))
and data.table
:
library(data.table)
setDT(df)[, .SD[all(Type == 'Ref')], ID]
Upvotes: 1
Reputation: 607
df = data.frame(
"ID" = c(55523, 55523, 19999, 19999, 55523, 33333, 33333, 33333, 11111),
"Type" = c("Ref", "Ref", "Ref", "View", "Ref", "View", "View", "Ref", "Ref")
)
ref = df[df$Type == "Ref",]
other = df[df$Type != "Ref",]
ref[!ref$ID %in% other$ID,]
# ID Type
# 1 55523 Ref
# 2 55523 Ref
# 5 55523 Ref
# 9 11111 Ref
Upvotes: 0
Reputation: 680
library(dplyr)
library(tibble)
tb <- tribble(~ID, ~Type,
55523, "Ref",
55523, "Ref",
19999, "Ref",
19999, "View",
55523, "Ref",
33333, "View",
33333, "View",
33333, "Ref",
11111, "Ref")
a <- tb %>% count(ID, Type) %>% group_by(ID) %>% count(ID) %>% filter(n == 1)
b <- tb %>% semi_join(a, by = c("ID" = "ID"))
b
# A tibble: 4 x 2
ID Type
<dbl> <chr>
1 55523 Ref
2 55523 Ref
3 55523 Ref
4 11111 Ref
Upvotes: 0