Reputation: 8818
Consider the following dataframe:
df <- data.frame(group = c("group1", "group1", "group2", "group2", "group2", "group3"), factor = paste("factor", 1:6, sep=""), vol = seq(from = 0.02, length.out = 6, by = 0.02))
The first column assigns each factor in the second column to a top-level group. The third column gives the standard deviation of each factor.
I would like to generate a summary table containing only the groups, with the standard deviation of each group defined as follows.
If group1 contains factors f1 and f2, and vol(f1) and vol(f2) are their respective standard deviations, then the standard deviation of group1 is:
std(group1) = sqrt[vol(f1)^2 + vol(f2)^2]
Is there any easy way of creating the summary table, where vol of each group is computed using this custom function?
Any help would be appreciated! Thank you.
Upvotes: 3
Views: 8109
Reputation: 80
I can recommend aggregate() from the base package stats, though you have to define a new function first.
ss<-function(x){sqrt(sum(x^2))}
aggregate(vol~group,data=df,FUN=ss)
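To sanity-check the aggregate() call, here is a minimal self-contained sketch that rebuilds the data frame from the question and compares the group1 result with the value computed by hand. The name root_sum_sq is only an illustrative label for the same helper, not something from the original answer.

```r
# Rebuild the data frame from the question
df <- data.frame(
  group  = c("group1", "group1", "group2", "group2", "group2", "group3"),
  factor = paste("factor", 1:6, sep = ""),
  vol    = seq(from = 0.02, length.out = 6, by = 0.02)
)

# Illustrative helper name: square root of the sum of squares
root_sum_sq <- function(x) sqrt(sum(x^2))

# One row per group, with vol replaced by the combined standard deviation
res <- aggregate(vol ~ group, data = df, FUN = root_sum_sq)
print(res)

# group1 by hand: sqrt(0.02^2 + 0.04^2)
by_hand <- sqrt(0.02^2 + 0.04^2)
print(isTRUE(all.equal(res$vol[res$group == "group1"], by_hand)))
```

The formula interface returns an ordinary data frame, so the result can be merged or renamed like any other table.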
Upvotes: 1
Reputation: 44648
A base solution for good measure.
by(df,df$group,function(x) sqrt(sum(x$vol^2)))
If you need it to look prettier:
as.table(by(df,df$group,function(x) sqrt(sum(x$vol^2))))
df$group
group1 group2 group3
0.04472136 0.14142136 0.12000000
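If a plain two-column data frame is preferred over a table, one possible base R variant is tapply(), sketched below. This swaps in a different function from the by() call above, and the names group_std and summary_df are only illustrative.

```r
# Rebuild the data frame from the question
df <- data.frame(
  group  = c("group1", "group1", "group2", "group2", "group2", "group3"),
  factor = paste("factor", 1:6, sep = ""),
  vol    = seq(from = 0.02, length.out = 6, by = 0.02)
)

# Apply the root-sum-of-squares to vol within each group;
# the result is a one-dimensional array named by group
group_std <- tapply(df$vol, df$group, function(v) sqrt(sum(v^2)))

# Reshape into a summary data frame (illustrative column names)
summary_df <- data.frame(
  group = names(group_std),
  std   = as.vector(group_std)
)
print(summary_df)
```

tapply() only sees the vol column, so it avoids passing the whole data frame to the function as by() does.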
Upvotes: 5
Reputation: 2300
May I propose a solution using the ddply function from the plyr package:
library(plyr)
ddply(df, .(group), summarize, std = sqrt(sum(vol^2)))
# group std
# 1 group1 0.04472136
# 2 group2 0.14142136
# 3 group3 0.12000000
Upvotes: 4
Reputation: 1453
Using the amazing new dplyr package, I think this is what you're looking for:
require(dplyr)
df <- data.frame(group = c("group1", "group1", "group2", "group2", "group2", "group3"),
factor = paste("factor", 1:6, sep=""),
vol = seq(from = 0.02, length.out = 6, by = 0.02))
df %>% group_by(group) %>% summarise(grp_std = sqrt(sum(vol^2)))
# Source: local data frame [3 x 2]
# group grp_std
# 1 group1 0.04472136
# 2 group2 0.14142136
# 3 group3 0.12000000
The chaining syntax using %>% (written %.% in the earliest dplyr releases; later versions replaced it with %>%) takes a bit of getting used to, but it becomes very intuitive. Alternative syntax:
df_grouped <- group_by(df, group)
summarise(df_grouped, grp_std=sqrt(sum(vol^2)))
Upvotes: 3