Reputation: 1166
With this dataframe:
table <- "
trt rep ss d1 d4 d5 d6 d7
1 1 1 0 0 0 0 0
1 1 2 0 0 0 0 0
1 1 3 0 0 1 2 2
1 2 1 0 0 1 3 6
1 2 2 0 1 1 2 4
1 2 3 0 0 0 1 1
1 3 1 0 0 0 0 0
1 3 2 0 0 0 0 0
1 3 3 0 1 1 1 1
2 1 1 0 0 0 0 0
2 1 2 0 0 0 1 1
2 1 3 0 0 0 1 1
2 2 1 0 0 0 0 0
2 2 2 0 0 0 0 0
2 2 3 0 0 0 0 1
2 3 1 0 0 0 0 0
2 3 2 0 0 0 1 3
2 3 3 . . . . .
"
d <- read.table(text = table, header = TRUE, check.names = FALSE, na.strings = ".")
I'd like to obtain a data frame with the proportion of positive values by trt for every day (d1, d4, ..., d7), such as this table:
# trt d1 d4 d5 d6 d7
# 1 0.00 0.22 0.44 0.56 0.56
# 2 0.00 0.00 0.00 0.38 0.50
Upvotes: 1
Views: 293
Reputation: 7293
Using data.table, something like this:
library(data.table)
d <- data.table(d)
# Proportion of positive values among the non-missing values, per trt
d[, lapply(.SD, function(x) sum(x > 0, na.rm = TRUE) / sum(!is.na(x))),
  .SDcols = grep("^d", names(d), value = TRUE),
  by = trt]
trt d1 d4 d5 d6 d7
1: 1 0 0.2222222 0.4444444 0.5555556 0.5555556
2: 2 0 0.0000000 0.0000000 0.3750000 0.5000000
Upvotes: 4
Reputation: 66819
Thanks to @A.Webb, here's a way in base R:
aggregate(d[,4:8]>0~d$trt, FUN = mean)
# d$trt d1 d4 d5 d6 d7
# 1 1 0 0.2222222 0.4444444 0.5555556 0.5555556
# 2 2 0 0.0000000 0.0000000 0.3750000 0.5000000
Here was my original idea:
rowsum(+(d[-(1:3)] > 0), d$trt, na.rm=TRUE) /
rowsum(+!is.na(d[-(1:3)]), d$trt, na.rm=TRUE)
The unary + is there because rowsum only works with numbers, and not with logicals; it coerces the logical matrix to integers.
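To see why the coercion matters, here is a minimal sketch on a toy logical matrix. The names m, g and counts are made up for illustration and are not part of the question's data:

```r
# Toy logical matrix; m, g and counts are illustrative names only.
m <- matrix(c(TRUE, FALSE, TRUE, TRUE), nrow = 2)
g <- c("a", "a")          # both rows belong to the same group

is.numeric(m)             # a logical matrix is not numeric
is.numeric(+m)            # unary plus coerces logical to integer

# Number of TRUE values per column within each group
counts <- rowsum(+m, g)
```

Dividing such counts by the matching counts of non-missing values gives the per-group proportions, which is what the two rowsum calls above do.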
Upvotes: 6
Reputation: 887541
We can use dplyr
library(dplyr)
d %>%
group_by(trt) %>%
summarise_each( funs(round(mean(.>0, na.rm=TRUE),2)), d1:d7)
# trt d1 d4 d5 d6 d7
# (int) (dbl) (dbl) (dbl) (dbl) (dbl)
#1 1 0 0.22 0.44 0.56 0.56
#2 2 0 0.00 0.00 0.38 0.50
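Note that summarise_each() and funs() are deprecated in current dplyr releases. A rough equivalent using across() might look like the sketch below, assuming dplyr 1.0.0 or later; the names txt and res are only for illustration, and the data is re-read so the snippet runs on its own:

```r
library(dplyr)

# Data from the question, re-read here so the sketch is self-contained
txt <- "
trt rep ss d1 d4 d5 d6 d7
1 1 1 0 0 0 0 0
1 1 2 0 0 0 0 0
1 1 3 0 0 1 2 2
1 2 1 0 0 1 3 6
1 2 2 0 1 1 2 4
1 2 3 0 0 0 1 1
1 3 1 0 0 0 0 0
1 3 2 0 0 0 0 0
1 3 3 0 1 1 1 1
2 1 1 0 0 0 0 0
2 1 2 0 0 0 1 1
2 1 3 0 0 0 1 1
2 2 1 0 0 0 0 0
2 2 2 0 0 0 0 0
2 2 3 0 0 0 0 1
2 3 1 0 0 0 0 0
2 3 2 0 0 0 1 3
2 3 3 . . . . .
"
d <- read.table(text = txt, header = TRUE, na.strings = ".")

# Share of positive values per trt, ignoring missing values
res <- d %>%
  group_by(trt) %>%
  summarise(across(d1:d7, ~ round(mean(.x > 0, na.rm = TRUE), 2)))
```

Here res should hold one row per trt with the rounded proportions for d1 through d7.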
Upvotes: 3