Reputation: 61
I'm trying to use the corr() function from the boot package to calculate weighted correlations. Its first argument should be a two-column matrix holding the two variables whose correlation we wish to calculate, and its second a vector of weights to be applied to each pair of observations.
Here is an example.
> head(d)
Shade_tolerance htot
1 4.56 25.0
2 2.73 23.5
3 2.73 21.5
4 3.97 17.0
5 4.00 25.5
6 4.00 23.5
> head(poids)
[1] 5.200440e-07 5.200440e-07 1.445016e-06 1.445016e-06 1.445016e-06 1.445016e-06
> corr(d,poids)
[1] 0.1357279
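For readers who want to see what such a weighted correlation involves, here is a minimal base R sketch. It assumes the usual weighted-moment definition with the weights normalized to sum to 1; weighted_cor is a hypothetical helper name, not part of any package. It uses only the six rows printed above, so it will not reproduce the 0.1357279 obtained on the full data.

```r
# Hypothetical helper: weighted Pearson correlation from weighted moments.
weighted_cor <- function(x, y, w) {
  w <- w / sum(w)                      # normalize weights to sum to 1
  mx <- sum(w * x)                     # weighted mean of x
  my <- sum(w * y)                     # weighted mean of y
  cov_xy <- sum(w * x * y) - mx * my   # weighted covariance
  var_x <- sum(w * x^2) - mx^2         # weighted variance of x
  var_y <- sum(w * y^2) - my^2         # weighted variance of y
  cov_xy / sqrt(var_x * var_y)
}

# The six rows shown by head(d) and head(poids)
x <- c(4.56, 2.73, 2.73, 3.97, 4.00, 4.00)
y <- c(25.0, 23.5, 21.5, 17.0, 25.5, 23.5)
w <- c(5.200440e-07, 5.200440e-07, 1.445016e-06,
       1.445016e-06, 1.445016e-06, 1.445016e-06)

r_weighted <- weighted_cor(x, y, w)

# With equal weights the result reduces to the ordinary Pearson correlation
r_equal <- weighted_cor(x, y, rep(1, length(x)))
```

Comparing r_equal with cor(x, y) is a quick sanity check that the helper behaves as intended.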
So I got it working on my matrix, but I would like to compute a separate correlation for each level of a factor, much as I would with the tapply() function.
> head(d2)
Shade_tolerance htot idp
1 4.56 25.0 19
2 2.73 23.5 19
3 2.73 21.5 19
4 3.97 17.0 18
5 4.00 25.5 18
6 4.00 23.5 18
So my dream would be to do something like this:
tapply(as.matrix(d2[,c(1,2)]), d2$idp, corr)
Except that, as you know, the first argument of tapply() needs to be a vector, not a matrix.
Would someone have any solution for me?
Thanks a lot for your help.
EDIT: I just realized that the weights needed for the weighted correlation are missing from the part of the data frame I showed you. So the solution would somehow have to take both the matrix and the weights according to the levels of the factor.
> head(df)
Shade_tolerance htot idp poids
1 4.56 25.0 19 5.200440e-07
2 2.73 23.5 19 5.200440e-07
3 2.73 21.5 19 1.445016e-06
4 3.97 17.0 18 1.445016e-06
5 4.00 25.5 18 1.445016e-06
6 4.00 23.5 18 1.445016e-06
I hope it is clear.
Upvotes: 3
Views: 2049
Reputation: 118799
If you have a "huge" data.frame, then using data.table might help:
library(boot)        # provides corr()
library(data.table)
dt <- as.data.table(df)
setkey(dt, "idp")
# one weighted correlation per level of idp
dt[, list(corr = corr(cbind(Shade_tolerance, htot), poids)), by=idp]
# idp corr
# 1: 18 0.9743547
# 2: 19 0.8387363
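If you would rather not depend on data.table, the same grouping can be sketched with base R's split() and sapply(). This is only a sketch under stated assumptions: the data frame is rebuilt from the six rows printed in the question, with the group labels taken from the head(d2) printout and the weights from head(poids), and res is just an illustrative name.

```r
library(boot)  # provides corr()

# Six example rows from the question (idp as printed by head(d2))
df <- data.frame(
  Shade_tolerance = c(4.56, 2.73, 2.73, 3.97, 4.00, 4.00),
  htot            = c(25.0, 23.5, 21.5, 17.0, 25.5, 23.5),
  idp             = c(19, 19, 19, 18, 18, 18),
  poids           = c(5.200440e-07, 5.200440e-07, 1.445016e-06,
                      1.445016e-06, 1.445016e-06, 1.445016e-06)
)

# Split the rows by idp, then pass each group's two-column matrix
# together with that group's weights to corr()
res <- sapply(split(df, df$idp), function(g) {
  corr(as.matrix(g[, c("Shade_tolerance", "htot")]), g$poids)
})
res  # named vector: roughly 0.974 for "18" and 0.839 for "19"
```

split() keeps each group's weights next to its observations, which is exactly what tapply() cannot do with a matrix.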
Upvotes: 2
Reputation: 98449
Here is a solution using the function ddply() from the plyr library.
library(plyr)
library(boot)  # provides corr()
ddply(df, .(idp), summarise,
      kor = corr(cbind(Shade_tolerance, htot), poids))
idp kor
1 18 0.9743547
2 19 0.8387363
Upvotes: 1
Reputation: 121568
Using by() and cbind(), with the data frame stored in dat:
library(boot)
by(dat,dat$idp,FUN=function(x)corr(cbind(x$Shade_tolerance,x$htot),x$poids))
dat$idp: 18
[1] 0.9743547
---------------------------------------------------------------------------------------
dat$idp: 19
[1] 0.7474093
Upvotes: 0