Reputation: 1460
I'm trying to add a column containing the standard deviation of each unique factor grouping in my data. I've researched approaches using the data.table and plyr packages but haven't had any luck. Here is a basic example of what I'm trying to accomplish.
Group Hours
120 45
120 60
120 54
121 33
121 55
121 40
I'm trying to turn the above into:
Group Hours SD
120 45 7.550
120 60 7.550
120 54 7.550
121 33 11.240
121 55 11.240
121 40 11.240
Upvotes: 0
Views: 92
Reputation: 1460
Thank you, David, for your detailed response! I used data.table to get what I was looking for. Here is a snippet of my final script, based on David's answer.
PayrollHoursSD <- as.data.table(PayrollHours2)[, SD := sd(TOTAL.HOURS), by = COMBO]
head(PayrollHoursSD)
# COMBO PAY.END.DATE TOTAL.HOURS SD
# 1: 1-2 10-06 42561.78 4297.287
# 2: 1-2 10-13 42177.88 4297.287
# 3: 1-2 10-20 44691.23 4297.287
# 4: 1-2 10-27 42709.28 4297.287
# 5: 1-2 11-03 44876.25 4297.287
# 6: 1-2 11-10 40582.44 4297.287
Upvotes: 0
Reputation: 92302
Base R solution (assuming your data is called df)
transform(df, SD = ave(Hours, Group, FUN = sd))
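For anyone who wants to try this without installing anything, here is a minimal self-contained sketch. It assumes the question's sample data is rebuilt by hand as a data frame named df (the name result is just for illustration). ave() applies sd to Hours within each Group and returns a vector as long as the input, which is why it drops straight into transform() as a new column.

```r
# Rebuild the sample data from the question (assumed layout: two numeric columns)
df <- data.frame(
  Group = c(120, 120, 120, 121, 121, 121),
  Hours = c(45, 60, 54, 33, 55, 40)
)

# ave() computes sd(Hours) per Group and repeats it on every row of that group
result <- transform(df, SD = ave(Hours, Group, FUN = sd))

print(result)
# SD is the sample standard deviation (n - 1 denominator):
# about 7.55 for group 120 and about 11.24 for group 121
```

Note that FUN has to be passed by name, since ave() treats every unnamed argument after the first as a grouping variable.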
data.table solution
library(data.table)
setDT(df)[, SD := sd(Hours), by = Group]
dplyr solution
library(dplyr)
df %>%
  group_by(Group) %>%
  mutate(SD = sd(Hours))
And here's a plyr solution (my first ever), since you asked for one
library(plyr)
ddply(df, .(Group), mutate, SD = sd(Hours))
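(It is better to avoid having both plyr and dplyr loaded at the same time, because they export functions with the same names, such as mutate and summarise, and whichever is attached last masks the other.)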
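If you do need both in one session, one possible workaround is to call plyr through its namespace instead of attaching it with library(). The sketch below assumes plyr is installed and again rebuilds the question's sample data by hand as df; the grouping variable is given as a character string, which ddply also accepts.

```r
# Sample data from the question, rebuilt by hand (assumption for illustration)
df <- data.frame(
  Group = c(120, 120, 120, 121, 121, 121),
  Hours = c(45, 60, 54, 33, 55, 40)
)

# plyr is never attached, so nothing it exports can mask or be masked by dplyr
result <- plyr::ddply(df, "Group", plyr::mutate, SD = sd(Hours))

print(result)
```

Qualifying mutate as plyr::mutate makes it explicit which package's version runs, regardless of what else is loaded.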
Upvotes: 4