Reputation: 41
I would like to obtain the significance value for a correlation computed on complex survey data in R! As far as I know, correlations are calculated as follows with the survey package:
var <- svyvar(~var1+var2, design, na.rm=TRUE)
cov2cor(as.matrix(var))
This only provides the correlation coefficients, though. Hence my question: How can I get the significance value for such a correlation? I know that this might be done by passing the sig.stats = TRUE
argument to the svycor
function in the jtools package:
svycor(~var1+var2,
design,
na.rm = TRUE,
sig.stats = TRUE,
bootn = 1000)
This, however, relies on the wtd.cor
function in the weights package - which I somehow have troubles installing, due to some issue with the gdata dependency. Does anyone have a workaround for this? Thanks a lot in advance!
Upvotes: 1
Views: 258
Reputation: 2765
The easiest way to get this significance test is
summary(svyglm(var2~var1, design=design, na.rm=TRUE))
The null hypothesis that the correlation is zero is exactly the null hypothesis that there is no linear component to the relationship between the variables. This gives you very close to the test you could do if svycontrast
worked on svyvar
output
That is, assume
svycontrast.svyvar<-function(stats, ...){
s<-as.vector(as.matrix(stats))
nms<-as.vector(outer(rownames(stats),colnames(stats),paste,sep=":"))
v<-vcov(stats)
names(s)<-nms
dimnames(v)<-list(nms,nms)
attr(s,"var")<-v
attr(s,"statistic")<-"variance"
class(s)<-"svystat"
svycontrast(s,...)
}
then with the built-in dclus1
data set as an example
> summary(svyglm(api00~ell,design=dclus1))
Call:
svyglm(formula = api00 ~ ell, design = dclus1)
Survey design:
svydesign(id = ~dnum, fpc = ~fpc, data = apiclus1)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 746.926 26.900 27.77 5.85e-13 ***
ell -3.721 0.467 -7.97 2.33e-06 ***
and
> b<-svyvar(~api00+ell,dclus1)
> svycontrast(b, quote(`api00:ell`/sqrt(`api00:api00`*`ell:ell`)))
nlcon SE
contrast -0.5941 0.0778
> -0.5941/ 0.0778
[1] -7.636247
for basically the same z-statistic that you could then turn into a p-value.
This is better than using the weights
package, because weights::wtd.cor
doesn't account for clustering or stratification, only for weights.
Upvotes: 0