Reputation: 33
I'm relatively new to R (used to work in Stata before) so sorry if the question is too trivial.
I have a data frame whose variables are named sequentially as q12.X.Y, where X takes the values 1 to 9 and Y takes the values 1 to 5.
I need to sum, row by row, all the q12.X.Y variables with Y from 1 to 3 (but NOT those ending in 4 or 5).
Ideally I would have written a loop based on the sequential numbers of the variables, namely something like:
df$test <- 0
for (i in 1:9) {
  for (j in 1:3) {
    df$test <- df$test + df$q12.i.j
  }
}
That obviously does not work.
I also tried the functions "rowSums" and "subset":
df$test <- rowSums(subset(df, select = ...))
However, I find this a bit cumbersome, as the column positions are not sequential and I do not want to type the names of all the variables.
Any suggestions on how to do that?
Upvotes: 1
Views: 53
Reputation: 887501
We can use grep to get the positions of the matching columns
rowSums(df[grep("^q12\\.[1-9]\\.[1-3]$", names(df))])
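A minimal sketch of the grep approach on toy data, to show which columns the pattern picks up. The data frame here is hypothetical (X only runs over 1:2, and every value is 1), and the pattern is anchored with ^ and $ so that it cannot match part of a longer column name.

```r
# Hypothetical toy data: X in 1:2, Y in 1:5, every value set to 1
cols <- paste0("q12.", rep(1:2, each = 5), ".", rep(1:5, times = 2))
df <- as.data.frame(matrix(1, nrow = 3, ncol = length(cols),
                           dimnames = list(NULL, cols)))

# Positions of the columns whose names end in .1, .2 or .3
keep <- grep("^q12\\.[1-9]\\.[1-3]$", names(df))
names(df)[keep]

# Row-wise sum over the selected columns only
df$test <- rowSums(df[keep])
```

If the columns contain missing values, rowSums() also accepts na.rm = TRUE.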
Or, if all the column names are present, select them exactly by building the names with paste0 (the `each = 9` is needed so that every X is paired with every Y)
rowSums(df[paste0(rep(paste0("q12.", 1:9, "."), 3), rep(1:3, each = 9))])
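For comparison, the loop from the question can also be made to work. `df$q12.i.j` looks for a column literally named "q12.i.j", because `$` does not substitute the values of i and j. Building the name as a string and indexing with `[[` avoids that. This is only a sketch on hypothetical toy data (X in 1:2, every value 1); with the real data the outer loop would run over 1:9.

```r
# Hypothetical toy data: X in 1:2, Y in 1:4, every value set to 1
cols <- paste0("q12.", rep(1:2, each = 4), ".", rep(1:4, times = 2))
df <- as.data.frame(matrix(1, nrow = 3, ncol = length(cols),
                           dimnames = list(NULL, cols)))

df$test <- 0
for (i in 1:2) {          # 1:9 in the real data
  for (j in 1:3) {
    # [[ ]] takes the column name as a string, so it can be built from i and j
    df$test <- df$test + df[[paste0("q12.", i, ".", j)]]
  }
}
```

The rowSums versions are shorter and vectorised, so the loop is mainly useful for seeing why the original attempt failed.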
Upvotes: 1