Reputation: 353
I have a series of variables named "HPV_x_ALL". The names differ only in x, which is a number (e.g., 11, 16, 18, 33). I'd like to use rowSums to sum the values of HPV_x_ALL for each observation. I tried using * as a wildcard for the number, but it doesn't work. Thank you!
Update: Hi, I added a reproducible dataset.
structure(list(HPV_16_ALL = c(1L, NA, 0L, 0L, 0L, 0L), HPV_18_ALL = c(0L,
NA, 0L, 0L, 0L, 0L), HPV_33_ALL = c(0L, NA, 0L, 0L, 0L, 0L)), row.names = 40:45, class = "data.frame")
Upvotes: 0
Views: 44
Reputation: 16178
Without a reproducible example, it is difficult to be sure that this answer will be appropriate.
However, starting from this dummy example:
set.seed(123)
df <- data.frame(Var = c(paste0("HPV_",11:15,"_ALL"),paste0("BPV_",11:15,"_ALL")),
Val = sample(1:100,10))
Var Val
1 HPV_11_ALL 31
2 HPV_12_ALL 79
3 HPV_13_ALL 51
4 HPV_14_ALL 14
5 HPV_15_ALL 67
6 BPV_11_ALL 42
7 BPV_12_ALL 50
8 BPV_13_ALL 43
9 BPV_14_ALL 97
10 BPV_15_ALL 25
You can get the rows corresponding to "HPV_xx_ALL" by doing:
grep("HPV_\\d{2}_ALL",df$Var, perl = TRUE)
[1] 1 2 3 4 5
So you can sum the values in the rows matching that pattern with:
sum(df[grep("HPV_\\d{2}_ALL",df$Var, perl = TRUE),"Val"])
[1] 242
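To check which names a pattern picks up before summing anything, grep with value = TRUE returns the matched strings rather than their positions. The sketch below is only an illustration: the vector `vars` is made up, and the anchored pattern with \\d+ (any number of digits) is a variation on the pattern above, useful if some numbers have one or three digits.

```r
# Hypothetical names, for illustration only
vars <- c("HPV_11_ALL", "HPV_6_ALL", "BPV_11_ALL", "HPV_16_ALL_X")

# \\d+ matches one or more digits; ^ and $ force the whole name to match
matched <- grep("^HPV_\\d+_ALL$", vars, value = TRUE, perl = TRUE)
print(matched)
```

Without the ^ and $ anchors, a name such as "HPV_16_ALL_X" would also match, which may or may not be what you want.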
If HPV_xx_ALL refers to column names rather than values in a column, you can do the same with:
rowSums(df[, grep("HPV_\\d{2}_ALL", names(df), perl = TRUE)])
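Applied to the data frame posted in the question's update, the column-name approach might look like the sketch below. The names `dat`, `hpv_cols`, `hpv_sum` and `hpv_sum_narm` are chosen here for illustration, and whether to pass na.rm = TRUE depends on how you want rows with missing values treated.

```r
# Data copied from the dput() output in the question
dat <- structure(list(HPV_16_ALL = c(1L, NA, 0L, 0L, 0L, 0L),
                      HPV_18_ALL = c(0L, NA, 0L, 0L, 0L, 0L),
                      HPV_33_ALL = c(0L, NA, 0L, 0L, 0L, 0L)),
                 row.names = 40:45, class = "data.frame")

# Positions of the columns whose names match HPV_<two digits>_ALL
hpv_cols <- grep("HPV_\\d{2}_ALL", names(dat), perl = TRUE)

# drop = FALSE keeps a data frame even if only one column matches
hpv_sum <- rowSums(dat[, hpv_cols, drop = FALSE])

# Variant that ignores NAs; a row that is entirely NA then sums to 0
hpv_sum_narm <- rowSums(dat[, hpv_cols, drop = FALSE], na.rm = TRUE)

print(hpv_sum)
print(hpv_sum_narm)
```

With the default na.rm = FALSE, the second row (all NA) gives NA; with na.rm = TRUE it gives 0, which can hide the fact that nothing was measured for that observation.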
Does this answer your question? If not, please provide a reproducible example of your dataset (see: How to make a great R reproducible example).
Upvotes: 2