Brandon Bertelsen

Reputation: 44638

Iterating an R Script as a function of sequential survey questions

The function below works perfectly for my purpose. The display is wonderful. Now my problem is I need to be able to do it again, many times, on other variables that fit other patterns.

In this example, I've output results for "q4a". I would like to be able to do the same for sequences of questions that follow patterns like q4<a-z> or q<4-10><a-z>, automagically.

Is there some way to iterate this such that the specified variable (in this case q4a) changes each time?

Here's my function:

require(reshape) # Using it for melt
require(foreign) # Using it for read.spss

d1 <- read.spss(...) ## Read in SPSS file

attach(d1,warn.conflicts=F) ## Attach SPSS data

q4a_08 <- d1[,grep("q4a_",colnames(d1))] ## Pull in everything matching q4a_X
q4a_08 <- melt(q4a_08) ## restructure data for post-hoc

detach(d1)

q4aaov <- aov(formula=value~variable,data=q4a) ## anova

Thanks in advance!

Upvotes: 0

Views: 474

Answers (2)

hadley

Reputation: 103898

I would recommend melting the entire dataset and then splitting the variable column into its component pieces (for example, question number and letter). Then you can more easily use subset() to look at (e.g.) just question four: subset(molten, q == 4).
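As a rough sketch of that idea, the snippet below builds a toy data frame, puts it in long format, and splits the column names into pieces. The toy column names, the regular expression, and the new column names (q, part, item) are assumptions for illustration, and base R's stack() stands in for melt() so the sketch runs without extra packages.

```r
# Toy stand-in for the survey data; names follow a q<number><letter>_<item> pattern.
set.seed(1)
d1 <- data.frame(
  q4a_1 = rnorm(5), q4a_2 = rnorm(5),
  q4b_1 = rnorm(5), q5a_1 = rnorm(5)
)

# Long format: one row per (respondent, column).
# stack() returns the columns "values" and "ind"; rename them to match melt().
molten <- stack(d1)
names(molten) <- c("value", "variable")
molten$variable <- as.character(molten$variable)

# Split names such as "q4a_1" into question number, letter and item.
pattern <- "^q([0-9]+)([a-z])_([0-9]+)$"
molten$q    <- as.integer(sub(pattern, "\\1", molten$variable))
molten$part <- sub(pattern, "\\2", molten$variable)
molten$item <- as.integer(sub(pattern, "\\3", molten$variable))

# Everything for question four, or only for q4a.
q4  <- subset(molten, q == 4)
q4a <- subset(molten, q == 4 & part == "a")
```

Once the pieces are separate columns, selecting a whole block of questions becomes an ordinary subset() condition rather than a new grep() pattern each time.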

Upvotes: 3

Josh Reich

Reputation: 6587

Not sure if this is what you are looking for, but to generate the list of questions:

> gsub('^', 'q', gsub(' ', '', 
    apply(expand.grid(1:10,letters),1,
           function(r) paste(r, sep='', collapse='')
         )))
  [1] "q1a"  "q2a"  "q3a"  "q4a"  "q5a"  "q6a"  "q7a"  "q8a"  "q9a"  "q10a"
 [11] "q1b"  "q2b"  "q3b"  "q4b"  "q5b"  "q6b"  "q7b"  "q8b"  "q9b"  "q10b"
 [21] "q1c"  "q2c"  "q3c"  "q4c"  "q5c"  "q6c"  "q7c"  "q8c"  "q9c"  "q10c"
 [31] "q1d"  "q2d"  "q3d"  "q4d"  "q5d"  "q6d"  "q7d"  "q8d"  "q9d"  "q10d"
 [41] "q1e"  "q2e"  "q3e"  "q4e"  "q5e"  "q6e"  "q7e"  "q8e"  "q9e"  "q10e"
 [51] "q1f"  "q2f"  "q3f"  "q4f"  "q5f"  "q6f"  "q7f"  "q8f"  "q9f"  "q10f"
 [61] "q1g"  "q2g"  "q3g"  "q4g"  "q5g"  "q6g"  "q7g"  "q8g"  "q9g"  "q10g"
 [71] "q1h"  "q2h"  "q3h"  "q4h"  "q5h"  "q6h"  "q7h"  "q8h"  "q9h"  "q10h"
 [81] "q1i"  "q2i"  "q3i"  "q4i"  "q5i"  "q6i"  "q7i"  "q8i"  "q9i"  "q10i"
 [91] "q1j"  "q2j"  "q3j"  "q4j"  "q5j"  "q6j"  "q7j"  "q8j"  "q9j"  "q10j"
 ...
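If the nested gsub()/apply() call is hard to read, the same vector can probably be built more directly. This is only a sketch of an alternative; the name questions is made up here.

```r
# Numbers 1..10 cycle fastest and each letter is repeated ten times,
# which reproduces the ordering of expand.grid(1:10, letters).
questions <- paste0("q", rep(1:10, times = 26), rep(letters, each = 10))
head(questions, 12)
```

Because paste0() works on the vectors directly, there is no padding to strip out, so the inner gsub(' ', '', ...) is not needed.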

And then you turn your inner part of the analysis into a function that takes the question prefix as a parameter:

analyzeQuestion <- function (prefix)
{
  q <- d1[,grep(prefix,colnames(d1))] ## Pull in every column matching the prefix, e.g. q4a_X
  q <- melt(q) ## restructure data for post-hoc

  qaaov <- aov(formula=value~variable,data=q) ## anova on the melted data
  return (LTukey(qaaov,which="",conf.level=0.95)) ## Tukey's post-hoc
}

Now - I'm not sure where your 'q4a' variable is coming from (as used in the aov(..., data=q4a)), so I've assumed it was meant to be the melted data frame and passed q instead. Note that LTukey() is not part of base R, so whichever package provides it has to be loaded first. But hopefully this helps.

To put the two together you can use sapply() to apply the analyzeQuestion function to each of the prefixes that we automagically generated.
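Here is a minimal, self-contained sketch of that last step. Several things in it are assumptions rather than part of the original code: the toy data frame, the data argument, the anchored grep() pattern, and the check for prefixes with no matching columns. It also swaps in base R's stack() for melt() and stats::TukeyHSD() for LTukey() so that it runs without extra packages.

```r
# Toy data standing in for the SPSS file (hypothetical column names).
set.seed(42)
d1 <- data.frame(
  q4a_1 = rnorm(30, mean = 0), q4a_2 = rnorm(30, mean = 1),
  q4a_3 = rnorm(30, mean = 2),
  q4b_1 = rnorm(30),           q4b_2 = rnorm(30)
)

analyzeQuestion <- function(prefix, data = d1) {
  # Anchor the pattern so "q4a" only matches columns named "q4a_<something>".
  cols <- grep(paste0("^", prefix, "_"), colnames(data))
  if (length(cols) < 2) return(NULL)   # fewer than two items: nothing to compare

  long <- stack(data[, cols, drop = FALSE])   # base R stand-in for melt()
  names(long) <- c("value", "variable")

  fit <- aov(value ~ variable, data = long)   # one-way anova across items
  TukeyHSD(fit, conf.level = 0.95)            # used here instead of LTukey()
}

# A few generated prefixes; some of them have no columns in the toy data.
prefixes <- paste0("q", rep(4:5, times = 2), rep(c("a", "b"), each = 2))

# simplify = FALSE keeps one list element per prefix, named by the prefix.
results <- sapply(prefixes, analyzeQuestion, simplify = FALSE)
results <- Filter(Negate(is.null), results)   # drop prefixes with no data
names(results)
```

Returning NULL for prefixes without matching columns lets you generate a generous list of candidate names and simply discard the ones that do not exist in the data.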

Upvotes: 4
