Ben

Reputation: 42313

Aggregate to compute the percentage of non-zero rows per group

What's the simplest way to compute the percentage of rows (1) containing ones and (2) containing zeros, per group?

Here's some small example data:

dat <- structure(list(rs = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0), group = c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1)), .Names = c("rs", "group"), row.names = c(NA, 
-62L), class = "data.frame")

Here's what I've got so far (don't laugh!):

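library(reshape2)                                  # dcast() comes from reshape2, not plyr
tab <- as.data.frame(table(dat))                   # long format: rs, group, Freq
dc <- dcast(tab, group ~ rs, value.var = "Freq")   # one row per group, one column per rs value
dc <- dc[, -1]                                     # drop the group column
dc[] <- lapply(dc, as.numeric)                     # counts as numeric
data.frame(prop.table(as.matrix(dc), 1))           # row-wise proportions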

Which works fine:

         X0         X1
1 1.0000000 0.00000000
2 0.8787879 0.12121212
3 0.9285714 0.07142857

But I'm sure there's a method that requires less typing.

Solutions with plyr and data.table most welcome.

Upvotes: 0

Views: 2021

Answers (1)

Matthew Lundberg

Reputation: 42679

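table almost does what you want. Since table(dat) puts rs in the rows and group in the columns, apply over the columns (margin 2) to divide each group's counts by their sum, then transpose so that each group is a row: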

t(apply(table(dat), 2, function(x) x/sum(x)))

## group         0          1
##     1 1.0000000 0.00000000
##     2 0.8787879 0.12121212
##     3 0.9285714 0.07142857
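
If less typing is the goal, base R's prop.table may be worth a look as well. The sketch below is not part of the original answer. It rebuilds the question's dat in compact form (an assumption made only so the snippet runs on its own) and uses the illustrative names props and ones:

```r
# Compact rebuild of the question's `dat` so the snippet runs on its own:
# 62 rows, ones at rows 4, 24, 40, 41 and 43, groups 3, 2, 1 in that order.
rs <- numeric(62)
rs[c(4, 24, 40, 41, 43)] <- 1
dat <- data.frame(rs = rs, group = rep(c(3, 2, 1), times = c(14, 33, 15)))

# Put group in the rows, then divide each row by its total (margin = 1).
props <- prop.table(table(group = dat$group, rs = dat$rs), margin = 1)
print(props)

# If only the share of ones is needed: the mean of a 0/1 vector
# is the proportion of ones in it.
ones <- tapply(dat$rs, dat$group, mean)
print(ones)
```

Multiply either result by 100 for percentages. The tapply shortcut relies on rs holding only 0 and 1; for other values, mean(rs != 0) inside the function would be the safer choice.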

Upvotes: 1
