Reputation: 5059
I want to aggregate a data frame by group, applying a custom operation to each group.
data
> df <- data.frame(replicate(9, 1:4))
X1 X2 X3 X4 X5 X6 X7 X8 X9
1 1 1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 2 2 2
3 3 3 3 3 3 3 3 3 3
4 4 4 4 4 4 4 4 4 4
aggregation
> aggregate(df[,2], list(df[,1]), mean)
Group.1 x
1 1 1
2 2 2
3 3 3
4 4 4
The above aggregation works, which is great. However, instead of mean I need to use a combination of functions, such as mean*sd/length^2. Should we be using something other than aggregate here?
Upvotes: 0
Views: 1317
Reputation: 887851
Here is an option with data.table
library(data.table)
setDT(df)[, .(x = mean(X2)*sd(X2)/.N^2), by = X1]
Upvotes: 0
Reputation: 5018
Here's how you would do it with dplyr:
library(dplyr)
df %>% group_by(X1) %>% summarize(x = mean(X2)*sd(X2)/length(X2)^2)
Upvotes: 1
Reputation: 9313
I modified your sample data frame so that each group has several rows. With only one data point per group, length is 1 and sd returns NA, so the formula would give NA for every group.
> df
X1 X2 X3 X4 X5 X6 X7 X8 X9
1 1 1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 2 2 2
3 3 3 3 3 3 3 3 3 3
4 4 4 4 4 4 4 4 4 4
5 1 1 1 1 1 1 1 1 1
6 2 2 2 2 2 2 2 2 2
7 3 3 3 3 3 3 3 3 3
8 4 4 4 4 4 4 4 4 4
9 1 4 4 4 4 4 4 4 4
10 2 5 5 5 5 5 5 5 5
11 3 6 6 6 6 6 6 6 6
12 4 7 7 7 7 7 7 7 7
13 1 4 4 4 4 4 4 4 4
14 2 5 5 5 5 5 5 5 5
15 3 6 6 6 6 6 6 6 6
16 4 7 7 7 7 7 7 7 7
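The answer does not show how the modified data frame was built. One way to reproduce it is sketched below; the helper names `x1` and `vals` are made up for illustration, and the construction is only a guess at what was done.

```r
# Grouping column: 1..4 repeated four times (16 rows in total)
x1 <- rep(1:4, times = 4)

# Value pattern shared by X2..X9: 1..4 twice, then 4..7 twice
vals <- c(rep(1:4, times = 2), rep(4:7, times = 2))

# replicate() returns a 16 x 8 matrix with one copy of vals per column
df <- data.frame(x1, replicate(8, vals))

# Use the same column names as the printed data frame
names(df) <- paste0("X", 1:9)
```

Every group in X1 now has four rows, so sd has more than one value to work with.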
To aggregate with a more elaborate formula, pass an anonymous function as the last argument:
aggregate(df[,2], list(df[,1]), function(x){mean(x)*sd(x)/length(x)^2})
Group.1 x
1 1 0.2706329
2 2 0.3788861
3 3 0.4871393
4 4 0.5953925
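The formula can also live in a named function, which makes it easier to reuse and to check edge cases. This is a minimal sketch: the names `score`, `original` and `modified` are made up here, and `modified` only rebuilds the first two columns of the data frame shown above.

```r
# Hypothetical helper holding the combined statistic
score <- function(x) mean(x) * sd(x) / length(x)^2

# Data from the question: one row per group
original <- data.frame(replicate(9, 1:4))
res_single <- aggregate(original[, 2], list(original[, 1]), score)
# sd() of a single value is NA, so every group's score is NA here

# First two columns of the modified data: four rows per group
modified <- data.frame(
  X1 = rep(1:4, times = 4),
  X2 = c(rep(1:4, times = 2), rep(4:7, times = 2))
)
res_multi <- aggregate(modified[, 2], list(modified[, 1]), score)
```

With four rows per group, `res_multi$x` should hold the same values as the output printed above.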
If you want to have the same column labels you could do:
aggregate(list(X2 = df[,2]), list(X1 = df[,1]), function(x){mean(x)*sd(x)/length(x)^2})
X1 X2
1 1 0.2706329
2 2 0.3788861
3 3 0.4871393
4 4 0.5953925
(or rename them afterwards with colnames)
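As a rough sketch of the renaming route, assuming the same grouping and value columns as above (the function name `score` is illustrative), the default labels can be overwritten with colnames. The formula interface of aggregate is another base R option that keeps the original column names without any renaming.

```r
# Two-column stand-in for the modified data frame
df <- data.frame(
  X1 = rep(1:4, times = 4),
  X2 = c(rep(1:4, times = 2), rep(4:7, times = 2))
)

# Hypothetical helper holding the combined statistic
score <- function(x) mean(x) * sd(x) / length(x)^2

# Option 1: aggregate, then replace the default names Group.1 and x
res <- aggregate(df$X2, list(df$X1), score)
colnames(res) <- c("X1", "X2")

# Option 2: the formula interface carries the column names through
res_formula <- aggregate(X2 ~ X1, data = df, FUN = score)
```

Both results have columns X1 and X2; which form to prefer is mostly a matter of taste.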
Upvotes: 1