alex

Reputation: 345

Calculate a mean, by a condition, within a factor [r]

I'm looking to calculate the mean of an outcome variable within each level of a grouping factor, using only the rows where another running variable reaches its maximum in that group.

Of course, the mean could be swapped for any other summary function, and the within-group condition (here, the maximum) for any other criterion.

library(data.table) #1.9.5
dt <- data.table(name   = rep(LETTERS[1:7], each = 3),
                 target = rep(c(0,1,2), 7),
                 filter = 1:21) 
dt

##    name target filter
## 1:    A      0      1
## 2:    A      1      2
## 3:    A      2      3
## 4:    B      0      4
## 5:    B      1      5
## 6:    B      2      6
## 7:    C      0      7

With this frame, the desired output is the mean of target over the rows where filter is largest within each name, which here is exactly 2 for every group.

Something like:

dt[ , .(mFilter = which.max(filter),
        target = target), by = name][ , 
      mean(target), by = c("name", "mFilter")]

... seems close, but isn't hitting it quite right.

The solution should return:

##    name   V1 
## 1:    A    2
## 2:    B    2
## 3:  ...

Upvotes: 4

Views: 946

Answers (1)

David Robinson

Reputation: 78590

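You could do this by subsetting target, within each group, to the rows where filter equals the group maximum, and then taking the mean: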

dt[, .(meantarget = mean(target[filter == max(filter)])), by = name]
#    name meantarget
# 1:    A      2
# 2:    B      2
# 3:    C      2
# 4:    D      2
# 5:    E      2
# 6:    F      2
# 7:    G      2
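
As a follow-up on why the attempt in the question falls short, and on the remark that both the statistic and the condition could be swapped out, here is a rough sketch. It rebuilds the question's `dt` so it runs on its own, and `summarise_where` is a hypothetical helper name introduced only for illustration; it is not part of data.table.

```r
library(data.table)

# Same data as in the question
dt <- data.table(name   = rep(LETTERS[1:7], each = 3),
                 target = rep(c(0, 1, 2), 7),
                 filter = 1:21)

# The attempt from the question: which.max() returns a position within the
# group (3 here), not a row filter, so every target value is kept and the
# second step averages 0, 1 and 2.
attempt <- dt[, .(mFilter = which.max(filter),
                  target  = target), by = name][
              , mean(target), by = c("name", "mFilter")]

# Using that position as an index into target instead
by_index <- dt[, .(meantarget = mean(target[which.max(filter)])), by = name]

# Hypothetical helper (an assumption, not a data.table function):
# `keep` maps the running variable to a logical vector, `stat` summarises
# the values that are kept.
summarise_where <- function(x, running,
                            keep = function(v) v == max(v),
                            stat = mean) {
  stat(x[keep(running)])
}

# Defaults reproduce the answer: mean of target where filter is maximal
general <- dt[, .(meantarget = summarise_where(target, filter)), by = name]

# Different condition and statistic: sum of target where filter is minimal
other <- dt[, .(sumtarget = summarise_where(target, filter,
                                            keep = function(v) v == min(v),
                                            stat = sum)), by = name]
```

One difference to keep in mind: `target[which.max(filter)]` keeps only the first row when the maximum is tied, whereas `filter == max(filter)` averages over all tied rows. With this data there are no ties, so both give the same result.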

Upvotes: 4
