alexmulo

Reputation: 791

Apply standard deviation to a Data Frame Split by Factors

I'm trying to apply the sd function to my data frame but it does not work:

sdsd <- by(nowna[, 1:16], nowna$stamm, sd)
Error in FUN(X[[1L]], ...) : could not find function "FUN"

Do you have any idea why?

Thanks a lot.

Upvotes: 2

Views: 4109

Answers (4)

BrodieG

Reputation: 52677

You almost certainly have an object assigned to sd. Notice how I recreate your error by assigning a value to the sd variable below:

by(warpbreaks[, 1], warpbreaks$wool, sd)
# warpbreaks$wool: A
# [1] 15.85143
# ------------------------------------------------------ 
#   warpbreaks$wool: B
# [1] 9.300921
sd <- 5
by(warpbreaks[, 1], warpbreaks$wool, sd)
# Error in FUN(X[[1L]], ...) : could not find function "FUN"
rm(sd)
by(warpbreaks[, 1], warpbreaks$wool, sd)
# warpbreaks$wool: A
# [1] 15.85143
# ------------------------------------------------------ 
#   warpbreaks$wool: B
# [1] 9.300921

You need to run rm(sd) to remove the object that is masking the sd function.
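If you would rather not depend on what happens to be in your workspace, a minimal sketch along these lines may help. It assumes the masking object lives in the environment where the code is run; the name res is only illustrative. Referring to the function as stats::sd sidesteps the masking object, and the exists() check removes it only if it is actually there.

```r
# Simulate the problem: a non-function object named sd in the workspace
sd <- 5

# stats::sd names the function explicitly, so the masking object is ignored
res <- by(warpbreaks[, "breaks"], warpbreaks$wool, stats::sd)
res

# Remove the masking object only if one exists in the current environment
if (exists("sd", inherits = FALSE)) rm(sd)

# sd now resolves to the function from the stats package again
is.function(sd)
```

Running ls() beforehand is a quick way to see whether a stray sd object is present at all.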

Upvotes: 1

lukeA

Reputation: 54247

sd(nowna[,1:16]) probably won't work, because sd expects a numeric vector rather than a data frame. Applying it one column at a time will work:

apply(nowna[,1:16], 2, function(x) by(x, nowna$stamm, sd))
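Since nowna is not available here, the following sketch uses the built-in iris data as a stand-in, with Species playing the role of nowna$stamm. It swaps in sapply and tapply instead of apply and by, which should give the same numbers but collected in a plain matrix (groups in rows, columns in columns); sd_by_group is just an illustrative name.

```r
# iris stands in for nowna; Species plays the role of nowna$stamm
# For each numeric column, compute the standard deviation within each group
sd_by_group <- sapply(iris[, 1:4], function(x) tapply(x, iris$Species, sd))

# One row per group, one column per variable
sd_by_group
```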

Upvotes: 0

Sven Hohenstein

Reputation: 81713

If you want to calculate the standard deviation of multiple columns, you can use aggregate:

aggregate(nowna[1:16], list(nowna$stamm), sd)
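As a runnable illustration of the same idea, here is a sketch on the built-in iris data (an assumed stand-in for nowna, with Species in place of stamm). It shows the list form used above next to the formula interface of aggregate, which names the grouping column for you. Note that the formula interface drops rows containing NA by default, so the two forms can differ on data with missing values.

```r
# List form, as in the call above; naming the list element labels the group column
sd_list <- aggregate(iris[1:4], list(Species = iris$Species), sd)

# Formula form: "." means every column other than the grouping variable
sd_formula <- aggregate(. ~ Species, data = iris, FUN = sd)

# iris has no missing values, so both forms agree here
all.equal(sd_list$Sepal.Length, sd_formula$Sepal.Length)
```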

Upvotes: 1

Prasanna Nandakumar

Reputation: 4335

library(plyr)
dt <- data.frame(age = rchisq(20, 10), group = sample(1:2, 20, replace = TRUE))

dt
         age group
1   9.908015     2
2  11.415043     2
3   7.849433     2
4   8.850696     2
5   6.194783     2
6  11.111339     2
7   9.789127     2
8  10.844352     1
9   8.686503     2
10 21.579142     2
11 11.750417     1
12  3.719226     1
13 12.086820     1
14 13.562351     1
15  4.636543     2
16 12.648083     1
17 10.780387     2
18 10.651318     2
19  5.976533     1
20 13.546345     2

ddply(dt,~group,summarise,mean=mean(age),sd=sd(age))
  group     mean       sd
1     1 10.08397 3.728750
2     2 10.38451 4.082198

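Another one-line variant uses the data.table package:

library(data.table)
dtf <- data.frame(age = rchisq(100000, 10), group = factor(sample(1:10, 100000, replace = TRUE)))
# Convert the larger data frame; dt (the 20-row example) is left unchanged
dtb <- data.table(dtf)
dtb[, list(mean = mean(age), sd = sd(age)), by = group]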

Using the aggregate function on the original 20-row dt:

aggregate(dt$age, by=list(dt$group), FUN=sd)
  Group.1        x
1       1 3.728750
2       2 4.082198

Upvotes: 5
