Reputation: 3488
Say we have this data:
dat <- data.frame(id = c(1, 1, 2, 2, 3, 4, 4, 5, 6, 6), Rx = c(1, 2, 1, 2, 1, 1, 1, 2, 2, 2))
id Rx
1 1 1
2 1 2
3 2 1
4 2 2
5 3 1
6 4 1
7 4 1
8 5 2
9 6 2
10 6 2
Here id is the subject id and Rx is the treatment they received. There are repeated observations, and the treatment may or may not be the same across a subject's rows.
I want to be able to summarize how many subjects only received Rx 1, only received Rx 2, and how many received Rx 1 and 2.
I'd prefer a dplyr solution, but data.table and base R would be fine too. I thought something like:
dat %>%
  group_by(id, Rx) %>%
  unique() %>%
  ...something
The end result should be something like:
Rx Count
1 2
2 2
Both 2
Thanks!
Upvotes: 8
Views: 304
Reputation: 92292
Here's another generalized solution
library(dplyr)
dat %>%
  group_by(id) %>%
  summarise(indx = toString(sort(unique(Rx)))) %>%
  ungroup() %>%
  count(indx)
# Source: local data table [3 x 2]
#
# indx n
# 1 1, 2 2
# 2 1 2
# 3 2 2
With data.table, similarly:
library(data.table)
setDT(dat)[, .(indx = toString(sort(unique(Rx)))), id][ , .N, indx]
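To get closer to the Rx/Count layout asked for in the question, the per-subject key can be relabelled before counting. This is only a minimal base R sketch of the same idea, not part of the answer's code; the `labels` lookup and the `res` data frame are hypothetical names introduced for illustration, and the lookup assumes exactly two treatments coded 1 and 2.

```r
dat <- data.frame(id = c(1, 1, 2, 2, 3, 4, 4, 5, 6, 6),
                  Rx = c(1, 2, 1, 2, 1, 1, 1, 2, 2, 2))

# One key per subject: the sorted, de-duplicated treatments as a string
key <- vapply(split(dat$Rx, dat$id),
              function(x) toString(sort(unique(x))),
              character(1))

# Hypothetical lookup from key to the label wanted in the output
labels <- c("1" = "1", "2" = "2", "1, 2" = "Both")

# Count subjects per label, keeping the order 1, 2, Both
counts <- table(factor(labels[key], levels = c("1", "2", "Both")))

res <- data.frame(Rx = names(counts), Count = as.vector(counts))
res
```

With more than two treatments the lookup would need one entry per combination, so the unlabelled key is probably the more practical output in that case.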
Upvotes: 5
Reputation: 9123
This solution does not generalize well to more than 2 treatments:
library(dplyr)
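dat %>%
  distinct(id, Rx) %>%
  group_by(id) %>%
  # one row per subject; summarise() keeps the flags, which distinct(id) would drop in current dplyr
  summarise(
    trt1 = setequal(1, Rx), # change due to comment from @Marat Talipov
    trt2 = setequal(2, Rx),
    both = setequal(1:2, Rx)
  ) %>%
  # across() replaces the deprecated summarise_each(funs(sum), ...); needs dplyr >= 1.0
  summarise(across(trt1:both, sum))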
This solution is shorter and does generalize to more than two treatments:
library(stringr)
dat %>%
  group_by(id) %>%
  # one row per subject holding the sorted, comma-separated set of treatments
  summarise(rx_list = str_c(sort(unique(Rx)), collapse = ",")) %>%
  count(rx_list)
Upvotes: 3
Reputation: 24480
Not exactly the output you asked for, but it's base R, a one-liner, and general:
table(do.call(function(...) paste(...,sep="_"),as.data.frame(table(dat)>0)))
# FALSE_TRUE TRUE_FALSE  TRUE_TRUE
#          2          2          2
If there are more than two treatments, the same call reports every combination of treatments that occurs in the data.
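For anyone unpacking the one-liner, here is a rough step-by-step sketch of what it appears to do. It uses apply() instead of do.call() purely for readability, and the intermediate names `present`, `pattern` and `pattern_counts` are made up for this illustration.

```r
dat <- data.frame(id = c(1, 1, 2, 2, 3, 4, 4, 5, 6, 6),
                  Rx = c(1, 2, 1, 2, 1, 1, 1, 2, 2, 2))

# id-by-Rx matrix: TRUE where a subject received that treatment at least once
present <- table(dat) > 0

# Collapse each subject's row into a pattern such as "TRUE_FALSE"
# (first position is Rx 1, second is Rx 2)
pattern <- apply(present, 1, paste, collapse = "_")

# Count how many subjects share each pattern
pattern_counts <- table(pattern)
pattern_counts
```

Here "TRUE_FALSE" means Rx 1 only, "FALSE_TRUE" means Rx 2 only, and "TRUE_TRUE" means both. Each extra treatment just adds one more position to the pattern.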
Upvotes: 2