Per

Reputation: 13

for loop in r with variable names

I'm trying to repeat a complex syntax for a set of variables. Essentially, using a data set like:

df <- data.frame( X=1:10, Y=6:15, Z=11:20)

I'd like to replace syntax like:

mean(df$X)
mean(df$Y)
mean(df$Z)

with a loop like:

for (n in c("X", "Y", "Z")) {mean(df$n)}

However, this rather Stata-like style of programming does not work in R. It seems the loop writes df$"X" instead of df$X. Is there a simple workaround?

UPDATE: Instead of computing the mean, I have a more complex function in which I repeatedly need to access variable names. My question is therefore not about computing means but about using variable names inside the loop.

Upvotes: 1

Views: 443

Answers (2)

Prem

Reputation: 11985

You can use summarise_at along with bind_cols.

In the code below, I apply mean to columns X and Y, and max to columns Y and Z. You can apply your own functions to different sets of columns in the same way.

library(dplyr)

df %>%
  summarise_at(vars(X, Y), funs(Mean = mean)) %>%
  bind_cols(df %>%
              summarise_at(vars(Y, Z), funs(Max = max)))

which gives

  X_Mean Y_Mean Y_Max Z_Max
1    5.5   10.5    15    20
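
For comparison, a rough base R sketch of the same idea, without dplyr, could use sapply on each subset of columns. The names means and maxes are only illustrative and are not part of the code above:

```r
# Same sample data as in the question
df <- data.frame(X = 1:10, Y = 6:15, Z = 11:20)

# Apply mean to columns X and Y, and max to columns Y and Z.
# 'means' and 'maxes' are hypothetical names chosen for this sketch.
means <- sapply(df[c("X", "Y")], mean)
maxes <- sapply(df[c("Y", "Z")], max)

print(means)
print(maxes)
```

Each call returns a vector named after the selected columns, so mean or max can be swapped for any function that takes a single column.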


Sample data:

df <- structure(list(X = 1:10, Y = 6:15, Z = 11:20), .Names = c("X", 
"Y", "Z"), row.names = c(NA, -10L), class = "data.frame")

Upvotes: 1

Dan

Reputation: 12084

This does the job.

for(n in c("X", "Y", "Z")) {mean(df[, n])}

Nothing is printed automatically inside a for loop, so to see the output, wrap mean() in print():

# [1] 5.5
# [1] 10.5
# [1] 15.5
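
The reason df$n fails is that $ does not evaluate n; it looks for a column literally called "n". Indexing with [[ ]] or [, ] accepts a character string instead. As a minimal sketch, assuming you want to keep the results rather than just print them, you could store each value under its column name (results is a made-up name for illustration):

```r
# Same sample data as in the question
df <- data.frame(X = 1:10, Y = 6:15, Z = 11:20)

# 'results' is a hypothetical container, filled one column at a time
results <- numeric(0)
for (n in c("X", "Y", "Z")) {
  # df[[n]] extracts the column whose name is stored in n
  results[n] <- mean(df[[n]])
}

print(results)
```

Replace mean() with whatever function you actually need; anything that takes a column vector will work in the same place.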

I'd still favour @Prem's solution, but then I don't know exactly what you're doing...

Upvotes: 0
