Eelco

Reputation: 25

Loop through variable names in R

I have a potentially very stupid question, but I can't seem to find a solution easily. I'm pretty new to R, so please forgive my ignorance.

I'm looking for a way to loop through all variables in my dataframe. For instance, to make two-way tables of all variables compared to one specific variable (say, Sex or Educational level). I used to work with Stata, but since R is free, I am now supposed to work with R (I heard there are a plethora of other benefits to working with R as well, so I am very willing to learn :)).

Say, I have 20 variables, of which 15 are answers from a survey and 5 are demographic variables. I would like to see how different answers compare to differences in demographics.

Normally I would tackle the problem above in Stata with something as simple as:

forvalues i = 1/5 {
    forvalues j = 1/3 {
        tab Sex Var`i'_`j', chi2
    }
}

making 15 tables, for the variables Var1_1 to Var5_3 vs Sex, and giving a Pearson chi2 statistic.

So, I tried what I thought was the same for R:

for (i in 1:5) {
  for (j in 1:3){
  print(table(chisq.test(paste(df$Sex, "df$Var",i,"_",j,sep=""))))    
  }
}

but this doesn't work.

Can anyone please point me in the right direction on how to solve this? Any help is highly appreciated!

Upvotes: 1

Views: 660

Answers (1)

Yuriy Barvinchenko

Reputation: 1595

Let's assume that df is your data frame and that its first 15 columns hold the answers. In that case you can use:

lapply(df[,1:15], function(x) {chisq.test(x, df$Sex)}) 
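If you would rather loop over the variable names, as in the Stata version, here is a minimal sketch. The attempt in the question fails because paste() only builds a character string; it does not fetch a column. Looking the column up with df[[name]] does. The example data below is made up purely so the snippet runs on its own; the column names Sex and Var1_1 to Var5_3 are taken from the question, everything else (n, vars, results) is an assumption for illustration.

```r
# Hypothetical example data: Sex plus survey items Var1_1 ... Var5_3
set.seed(1)
n <- 200
df <- data.frame(Sex = sample(c("F", "M"), n, replace = TRUE))
for (i in 1:5) {
  for (j in 1:3) {
    df[[paste0("Var", i, "_", j)]] <- sample(1:3, n, replace = TRUE)
  }
}

# Build the 15 column names as strings: Var1_1, Var1_2, ..., Var5_3
vars <- paste0("Var", rep(1:5, each = 3), "_", rep(1:3, times = 5))

results <- list()
for (v in vars) {
  tab <- table(df$Sex, df[[v]])    # two-way table: Sex by the survey item
  results[[v]] <- chisq.test(tab)  # Pearson chi-squared test on that table
  print(tab)
  print(results[[v]])
}
```

Keeping the tests in a named list means you can pull out a single result later, for example results[["Var2_3"]]$p.value.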

Upvotes: 1
