user3465783

Reputation: 13

How to approach loop with increasing variable name in R

My dataset is currently a set of answers to twenty questions with 300 observations.

The questions are labeled q1, q2, q3, etc.

Each observation gives a 1 to 10 response.

The code below is what I have. What I want is for the question number (the q1 part) to change as a loop counter changes in R.

totaltenq1 <- sum(UpdatedQualtrix$tenq1)
totalnineq1 <- sum(UpdatedQualtrix$nineq1)
totaleightq1 <- sum(UpdatedQualtrix$eightq1)
totalsevenq1 <- sum(UpdatedQualtrix$sevenq1)
totalsixq1 <- sum(UpdatedQualtrix$sixq1)
totalfiveq1 <- sum(UpdatedQualtrix$fiveq1)
totalfourq1 <- sum(UpdatedQualtrix$fourq1)
totalthreeq1 <- sum(UpdatedQualtrix$threeq1)
totaltwoq1 <- sum(UpdatedQualtrix$twoq1)
totaloneq1 <- sum(UpdatedQualtrix$oneq1)

totaltenq2 <- sum(UpdatedQualtrix$tenq2)
totalnineq2 <- sum(UpdatedQualtrix$nineq2)
totaleightq2 <- sum(UpdatedQualtrix$eightq2)
totalsevenq2 <- sum(UpdatedQualtrix$sevenq2)
totalsixq2 <- sum(UpdatedQualtrix$sixq2)
totalfiveq2 <- sum(UpdatedQualtrix$fiveq2)
totalfourq2 <- sum(UpdatedQualtrix$fourq2)
totalthreeq2 <- sum(UpdatedQualtrix$threeq2)
totaltwoq2 <- sum(UpdatedQualtrix$twoq2)
totaloneq2 <- sum(UpdatedQualtrix$oneq2)

I would like to have code like this (pseudocode):

count = 20

for (i in 1:count){
totaltenq(i) <- sum(UpdatedQualtrix$tenq(i))
totalnineq(i) <- sum(UpdatedQualtrix$nineq(i))
etc
}

That way, when I do this again in the future, I only have to tell R how many questions there are and the code will adapt. I also avoid having 10,000 lines of code from copying and pasting the same block 20 times.

Upvotes: 0

Views: 237

Answers (1)

MrFlick

Reputation: 206242

I don't think you need any loops at all. It all depends on how you want to store those values. I'm a big fan of not having more variables than necessary.

Here's some sample data. I'll just make 10 rows (observations) with values 1-5.

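# Note: sample() changed its default algorithm in R 3.6.0, so newer
# versions give different numbers than the output shown below.
set.seed(15)
Q <- 3  # number of questions
numbs <- c("one","two","three","four","five","six","seven","eight","nine","ten")
qs <- paste0("q", 1:Q)
# all response/question name combinations: "oneq1", "twoq1", ...
qnumbs <- outer(numbs, qs, paste0)

# 10 rows, one column per response/question pair, random values 1-5
UpdatedQualtrix <- data.frame(ID=1:10,
    matrix(sample(1:5, 10*length(numbs)*Q, replace=TRUE), nrow=10))
colnames(UpdatedQualtrix) <- c("ID", qnumbs)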

Now I can sum up each of the columns with

( Qsums<-colSums(UpdatedQualtrix[, qnumbs]) )

# oneq1   twoq1 threeq1  fourq1  fiveq1   sixq1 sevenq1 eightq1  nineq1   tenq1 
#    37      35      29      26      32      39      40      33      40      26 
# oneq2   twoq2 threeq2  fourq2  fiveq2   sixq2 sevenq2 eightq2  nineq2   tenq2 
#    37      31      19      29      25      38      36      35      28      27 
# oneq3   twoq3 threeq3  fourq3  fiveq3   sixq3 sevenq3 eightq3  nineq3   tenq3 
#    37      30      31      31      24      31      29      31      25      41 
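For comparison, the loop sketched in the question can be written quite literally by building each column name with paste0() and indexing with [[ ]]. This is only a minimal sketch on made-up data: the data frame toy and the variables count, words and totals are hypothetical names, not part of the original code. It stores the results in one named list rather than in separate variables.

```r
# Toy data (hypothetical): two questions, two response columns each
toy <- data.frame(
  oneq1 = c(1, 0, 1), twoq1 = c(0, 1, 0),
  oneq2 = c(0, 0, 1), twoq2 = c(1, 1, 0)
)

count <- 2                 # number of questions
words <- c("one", "two")   # response labels used in the column names

# One named list holds every total instead of one variable per column
totals <- list()
for (i in seq_len(count)) {
  for (w in words) {
    col <- paste0(w, "q", i)           # e.g. "oneq1"
    totals[[col]] <- sum(toy[[col]])   # [[ ]] accepts a computed name; $ does not
  }
}

unlist(totals)   # same values as colSums(toy)
```

The loop works, but colSums() gives the same named vector in a single call, which is why the loop is not really needed here.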

And if we want the totals per question we can do

sapply(qs, function(a, b) sum(Qsums[paste0(b,a)]), b=numbs)

#  q1  q2  q3 
# 337 305 310 

Or if we want the counts per response we can do

sapply(numbs, function(a, b) sum(Qsums[paste0(a,b)]), b=qs)

#   one   two three  four  five   six seven eight  nine   ten 
#   111    96    79    86    81   108   105    99    93    94 
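Another way to get both sets of totals, sketched here under the assumption that the column names follow the response-then-question pattern used above, is to arrange the sums in a response-by-question matrix and take row and column sums. The small Qsums vector below is made up for illustration, and totals is a hypothetical name.

```r
# Hypothetical stand-in for Qsums: named column totals
numbs <- c("one", "two", "three")
qs    <- c("q1", "q2")
Qsums <- c(oneq1 = 4, twoq1 = 2, threeq1 = 1,
           oneq2 = 3, twoq2 = 5, threeq2 = 0)

# Look the sums up by name so the layout does not depend on column order;
# rows are responses, columns are questions (matrix() fills column by column)
totals <- matrix(Qsums[as.vector(outer(numbs, qs, paste0))],
                 nrow = length(numbs), dimnames = list(numbs, qs))

colSums(totals)  # total per question
rowSums(totals)  # total per response
```

Keeping the totals in one matrix also makes it easy to pull out a single cell, for example totals["two", "q2"].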

You might also want to consider melting your data, since it's so structured. The reshape2 package can help. You can do

library(reshape2)

mm <- melt(UpdatedQualtrix, id.vars="ID")
mm <- cbind(mm[,-2], colsplit(mm$variable, "q", c("resp","q")))
mm$resp <- factor(mm$resp, levels=numbs)

to turn your data into a "tall" format so each value has its own row with a column for ID, value, response and question.

str(mm)

# 'data.frame': 300 obs. of  4 variables:
#  $ ID   : int  1 2 3 4 5 6 7 8 9 10 ...
#  $ value: int  4 1 5 4 2 5 5 2 4 5 ...
#  $ resp : Factor w/ 10 levels "one","two","three",..: 1 1 1 1 1 1 1 1 1 1 ...
#  $ q    : int  1 1 1 1 1 1 1 1 1 1 ...
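If you would rather not depend on an extra package, a similar tall layout can be built with base R alone. The following is a rough sketch on made-up data, not a drop-in replacement for the melt() call: wide, value_cols and long are hypothetical names, and stack() names its value column values rather than value.

```r
# Toy wide data (hypothetical), with columns named <response>q<question>
wide <- data.frame(ID = 1:2,
                   oneq1 = c(4, 1), twoq1 = c(5, 2),
                   oneq2 = c(3, 3), twoq2 = c(2, 5))

value_cols <- setdiff(names(wide), "ID")

# stack() puts all values in one column and the source column name in `ind`
long <- stack(wide[value_cols])
long$ID <- rep(wide$ID, times = length(value_cols))

# Split the original column name into its response and question parts
nm <- as.character(long$ind)
long$resp <- sub("q[0-9]+$", "", nm)        # "oneq1" -> "one"
long$q    <- as.integer(sub("^.*q", "", nm)) # "oneq1" -> 1
long <- long[, c("ID", "values", "resp", "q")]

aggregate(values ~ q, long, sum)   # total per question
```

The regular expressions assume that the response label itself never contains the letter q, which holds for the labels one through ten.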

And then we can more easily do other calculations. If you want the total scores by question, you could do

aggregate(value~q, mm, sum)
#   q value
# 1 1   337
# 2 2   305
# 3 3   310

If you wanted the average value for each question/response you could do

with(mm, tapply(value, list(q,resp), mean))
#   one two three four five six seven eight nine ten
# 1 3.7 3.5   2.9  2.6  3.2 3.9   4.0   3.3  4.0 2.6
# 2 3.7 3.1   1.9  2.9  2.5 3.8   3.6   3.5  2.8 2.7
# 3 3.7 3.0   3.1  3.1  2.4 3.1   2.9   3.1  2.5 4.1

Upvotes: 2
