Jibril
Jibril

Reputation: 1037

Mean of means with a subset

Imagine a data table

ID    Score
1     10
1     13
1     12
2     10
3     6
3     6

The average for user 1 is 11.67. The average for user 2 is 10. The average for user 3 is 6.

The UniqAverage I am looking for would be ((11.67+10+6)/3) =9.223`

For replication purposes feel free to use...

library(data.table)
df = data.frame( ID=c(1,1,1,2,3,3), Score1=c(10,13,12,10,6,6) ) 
dt = data.table(df)

A previous question lead to me learning I could do

dt[, mean(Score1), ID][,mean(V1)]

To get this average of averages by ID.

However, what if I only wanted the mean of scores > 10?

So I'd wind up with

ID    V1
1     12.5
2     0
3     0

So my final result would be 4.167

Attempts to use which failed.

dt[, mean( which(Score1) > 10 ), ID][,mean(V1)]

Upvotes: 2

Views: 80

Answers (2)

akrun
akrun

Reputation: 887291

We can join on a dataset with unique 'ID' after getting the mean of 'Score1' by 'ID' where the "Score1" is greater than 10.

dt1 <- dt[Score1>10, .(V1= mean(Score1)), by = ID
          ][data.table(ID= unique(dt$ID)), on = "ID"][is.na(V1), V1 :=0][]
dt1
#   ID   V1
#1:  1 12.5
#2:  2  0.0
#3:  3  0.0

dt1[, mean(V1)]
#[1] 4.166667

Upvotes: 2

Dan Lewer
Dan Lewer

Reputation: 956

Here's a solution with base R and aggregate:

x <- aggregate(df, list(df$ID), function(x) mean(ifelse(x > 10, x, NA), na.rm = T))[,3]
mean(ifelse(is.na(x), 0, x))

Upvotes: 0

Related Questions