Pedro Romani

Reputation: 51

How to make a loop over multiple columns with the svyby function of the survey package?

I have tried many approaches, but I have not been able to solve the problem. I found related answers here, here and here, but I couldn’t adapt them to my problem.

I would like to combine each element of the string vector 'pop' with each element of the string vector 'territ', and evaluate each pair on a subset of the design defined by a threshold on the column “enroll”, taken from a numeric vector (“enroll_lines”). So there are three iterations I want to do with svyby: two over string vectors and one over the numeric vector used for subsetting.

I want a data frame with all the result combinations of the three vectors over the design object “dclus1”.

Thank you in advance for your attention and effort.

library(survey)
data(api)
df <- apiclus1
df$pais <- 0
df$pop_tot <- 1

pop <- c("pop_tot", "stype", "awards")
territ <- c("pais","cname", "dname")
enroll_lines = c(355, 455, 555)

dclus1<-svydesign(id=~dnum, weights=~pw, data=df, fpc=~fpc)

svyloop <- function(vv1, vv2, dsgn, xx) {
  svyby( as.formula( paste0( "~" , vv1)) , by = as.formula( paste0( "~" , vv2)) , subset(dsgn, enroll < xx), svytotal , vartype = 'cv')
}
svyloop(pop, territ, dclus1, enroll_lines)
#Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :  contrasts can be applied only to factors with 2 or more levels

sapply(dclus1, svyloop, pop, territ, enroll_lines)
#Even when keeping only columns with two or more levels, the column "enroll" is not found, as the message below shows:
#Error in subset.default(dsgn, enroll < xx) : object 'enroll' not found

The other way I've tried was to put an "i" of iteration in the function.

jj <- 1:3
svyloop <- function(vv1, vv2, xx, i) {
  svyby( as.formula( paste0( "~" , vv1[i])) , by = as.formula( paste0( "~" , vv2[i])) , subset(dclus1, enroll < xx[i]), svytotal , vartype = 'cv')
}
svyloop(pop, territ, enroll_lines, jj)
sapply(dclus1, svyloop, pop, territ, enroll_lines)

#Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels

Upvotes: 5

Views: 906

Answers (2)

Thomas Lumley

Reputation: 2765

As an additional note to @Mathdragon's answer, I'd use bquote rather than as.formula in svyloop:

svyloop <- function(vv1, vv2, dsgn, xx) {
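  # vv1 and vv2 arrive as strings, so as.name() turns them into symbols
  # before bquote() splices them into the one-sided formulas
  eval(bquote(svyby(~.(as.name(vv1)), by = ~.(as.name(vv2)), subset(.(dsgn), enroll < .(xx)), svytotal, vartype = 'cv')))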
}

Upvotes: 1

Mathdragon

Reputation: 92

sapply loops over its first argument, and you don't want to iterate over your design dclus1 but over pop, territ and enroll_lines. Your sapply call cannot work because it never passes a design object to svyloop: each element of dclus1 is supplied as vv1, and territ ends up in the dsgn position. If you nest several sapply calls instead, your function works. A simple but inelegant solution:

sapply(pop, function(x)
  sapply(territ, function(y)
    sapply(enroll_lines, function(z) svyloop(x, y, dclus1, z),
           simplify = FALSE),
    simplify = FALSE),
  simplify = FALSE)

This way you get a nested list of your tables and can combine them in any way you like.

There are probably far more efficient solutions with mapply but nested sapplys work, too.
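As a rough sketch of the mapply-style alternative mentioned above (not necessarily more efficient), one could build every combination with expand.grid and loop over its rows with Map. The names grid and results are only illustrative and not from the question; svyloop is the question's first version, called with one value per argument. This assumes the survey package is installed.

```r
library(survey)

# Setup copied from the question
data(api)
df <- apiclus1
df$pais <- 0
df$pop_tot <- 1

pop <- c("pop_tot", "stype", "awards")
territ <- c("pais", "cname", "dname")
enroll_lines <- c(355, 455, 555)

dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = df, fpc = ~fpc)

# The question's function, used here with scalar arguments only
svyloop <- function(vv1, vv2, dsgn, xx) {
  svyby(as.formula(paste0("~", vv1)),
        by = as.formula(paste0("~", vv2)),
        subset(dsgn, enroll < xx),
        svytotal, vartype = "cv")
}

# One row per (pop, territ, enroll_line) combination: 3 * 3 * 3 = 27 rows
# (grid is a hypothetical helper name)
grid <- expand.grid(pop = pop, territ = territ, enroll_line = enroll_lines,
                    stringsAsFactors = FALSE)

# Map walks the three columns in parallel and returns a flat list
results <- Map(function(p, t, e) svyloop(p, t, dclus1, e),
               grid$pop, grid$territ, grid$enroll_line)

# Label each result with the combination that produced it
names(results) <- paste(grid$pop, grid$territ, grid$enroll_line, sep = ".")
```

The result is a flat list with one svyby table per combination, for example results[["stype.cname.455"]]. The tables have different columns depending on the 'pop' variable, so stacking them into a single data frame would still need a reshaping step of your choice.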

Upvotes: 1
