I would like to find unique combinations of missing observations in matrix rows by a group variable.
I can do so with the example data set by using the sequence of subset, cbind and rbind commands shown to generate the matrix u3.
However, I suspect there is a much better way that would not involve 'manually' subsetting the matrix for each level of the group variable. I have tried using the tapply command at the bottom, but cannot get it to work.
Thank you sincerely for any suggestions.
my.data <- matrix(c(
1, 0, 1, 1, 1,
NA, 1, 1, 0, 1,
NA, 0, 0, 0, 1,
NA, 1,NA, 1, 1,
NA, 1, 1, 1, 1,
0, 0, 1, 0, 1,
NA, 0, 0, 0, 1,
0,NA,NA,NA, 1,
1,NA,NA,NA, 1,
1, 1, 1, 1, 1,
NA, 1, 1, 0, 1,
1, 0, 1, 1, 2,
1, 1, NA, 0, 2,
NA, NA, NA, 0, 2,
NA, NA,NA, 1, 2,
1, 1, 1, NA, 2,
0, 0, 1, 0, 2,
NA, 0, 0, 0, 2,
0,NA,NA,NA, 2,
1,NA,NA,NA, 2,
1, 1, 1, 1, 2,
0, 1, 1, NA, 2
),
nrow=22, byrow=TRUE,
dimnames = list(NULL, c("c1","c2","c3","c4","my.group")))
my.data <- as.data.frame(my.data)
my.data
g1 <- subset(my.data, my.data$my.group==1)
u1 <- unique( is.na(g1[1:4]) )
u1 <- cbind(1,u1)
g2 <- subset(my.data, my.data$my.group==2)
u2 <- unique( is.na(g2[1:4]) )
u2 <- cbind(2,u2)
u3 <- rbind(u1,u2)
u3
tapply(my.data[,1:4], my.data$my.group, function(x) {unique(is.na(x), 'rows') } )
Here is the matrix u3:
     c1 c2 c3 c4
1  1  0  0  0  0
2  1  1  0  0  0
4  1  1  0  1  0
8  1  0  1  1  1
12 2  0  0  0  0
13 2  0  0  1  0
14 2  1  1  1  0
16 2  0  0  0  1
18 2  1  0  0  0
19 2  0  1  1  1
Reputation: 56915
You can use the plyr package for this; it's fantastic for "apply this function to each group"-type applications. In particular, the function ddply:
library(plyr)
u3 <- ddply(my.data,.(my.group),
function(df)
data.frame(unique(is.na(df[1:4])))
)
Then u3 looks like this:
   my.group    c1    c2    c3    c4
1         1 FALSE FALSE FALSE FALSE
2         1  TRUE FALSE FALSE FALSE
3         1  TRUE FALSE  TRUE FALSE
4         1 FALSE  TRUE  TRUE  TRUE
5         2 FALSE FALSE FALSE FALSE
6         2 FALSE FALSE  TRUE FALSE
7         2  TRUE  TRUE  TRUE FALSE
8         2 FALSE FALSE FALSE  TRUE
9         2  TRUE FALSE FALSE FALSE
10        2 FALSE  TRUE  TRUE  TRUE
You could do as.matrix(u3) to get the numeric matrix.
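For comparison, here is a minimal base-R sketch that should give the same 0/1 matrix without plyr. It is not the ddply approach above, just an alternative under one assumption: a row counts as "unique" when its (group, missingness pattern) pair has not been seen before, so a single unique() over the group column bound to the NA flags is enough. The names na.flags and u3.base are made up for this illustration.

```r
# Rebuild the example data from the question (same values, same row order).
my.data <- as.data.frame(matrix(c(
   1,  0,  1,  1, 1,
  NA,  1,  1,  0, 1,
  NA,  0,  0,  0, 1,
  NA,  1, NA,  1, 1,
  NA,  1,  1,  1, 1,
   0,  0,  1,  0, 1,
  NA,  0,  0,  0, 1,
   0, NA, NA, NA, 1,
   1, NA, NA, NA, 1,
   1,  1,  1,  1, 1,
  NA,  1,  1,  0, 1,
   1,  0,  1,  1, 2,
   1,  1, NA,  0, 2,
  NA, NA, NA,  0, 2,
  NA, NA, NA,  1, 2,
   1,  1,  1, NA, 2,
   0,  0,  1,  0, 2,
  NA,  0,  0,  0, 2,
   0, NA, NA, NA, 2,
   1, NA, NA, NA, 2,
   1,  1,  1,  1, 2,
   0,  1,  1, NA, 2
), nrow = 22, byrow = TRUE,
dimnames = list(NULL, c("c1", "c2", "c3", "c4", "my.group"))))

# Logical matrix: TRUE where an observation is missing (hypothetical name).
na.flags <- is.na(my.data[1:4])

# Bind the numeric group id in front; cbind() coerces the logical flags
# to 0/1 because the group column is numeric. unique() then keeps the
# first occurrence of each distinct (group, pattern) row.
u3.base <- unique(cbind(my.group = my.data$my.group, na.flags))
u3.base
```

Because unique() keeps rows in order of first appearance and the example data is already sorted by my.group, the rows come out grouped the same way as in u3. With unsorted data you would probably want to order by my.group first.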