M. Beausoleil

Reputation: 3557

dplyr and for loop in R

So here is the problem: I want to use a for loop in my R code to summarize different columns.

As an example, here is what it could look like:

all.columns<-c("column4","column5","column6","column7")
for (i in 1:4) {  
df%>%
 group_by(column3)%>%
 summarise(Mean=mean(all.columns[i]),
           Max=max(all.columns[i]))
} 

Here df is a data frame, column3 is the grouping variable (for example, Year), and columns 4 to 7 are the ones I want to summarise repeatedly with the same code.

Do you know how to do this with dplyr? If you have an alternative without dplyr, I'd like to hear about it.

I've tried passing the column name as a character string, but it's not working...

Upvotes: 4

Views: 10330

Answers (2)

Andrew Taylor

Reputation: 3488

How about this:

Fake data:

df <- data.frame(column3=rep(letters[1:2], 10), 
                 column4=rnorm(20),
                 column5=rnorm(20),
                 column6=rnorm(20),
                 column7=rnorm(20))

dplyr solution:

library(dplyr)
df %>% 
  group_by(column3) %>% 
  summarise_each(funs(mean, max), column4:column7)

Output:

Source: local data frame [2 x 9]

  column3 column4_mean column5_mean column6_mean column7_mean column4_max column5_max
1       a     0.186458   0.02662053  -0.00874544    0.3327999    1.563171    2.416697
2       b     0.336329  -0.08868817   0.31777871    0.1934266    1.263437    1.142430
Variables not shown: column6_max (dbl), column7_max (dbl)
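
In current dplyr releases, summarise_each() and funs() are deprecated. Here is a minimal sketch of the same summary written with across(), assuming dplyr 1.0.0 or later. The set.seed() call and the name res are additions for illustration, so the numbers will not match the output above:

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible

# Same shape as the fake data above
df <- data.frame(column3 = rep(letters[1:2], 10),
                 column4 = rnorm(20),
                 column5 = rnorm(20),
                 column6 = rnorm(20),
                 column7 = rnorm(20))

# Apply mean and max to every column from column4 through column7, per group.
# The names of the list become the suffixes of the output columns.
res <- df %>%
  group_by(column3) %>%
  summarise(across(column4:column7, list(mean = mean, max = max)))

res
```

The result should again have one row per group and columns named column4_mean, column4_max and so on, ordered by input column rather than by function.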

Upvotes: 6

Señor O

Reputation: 17432

This doesn't work because you're calling column names as if they're objects when you have them stored as characters.
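
To illustrate: inside summarise(), mean(all.columns[i]) is evaluated as mean("column4"), which returns NA with a warning, and max("column4") simply returns the string itself. If you want to keep the for loop, one possible sketch is to look the column up by name with the .data pronoun. This assumes a dplyr version that supports .data[[ ]]; the fake data and the results list are made up for illustration:

```r
library(dplyr)

# Hypothetical data with the same layout as in the question
df <- data.frame(column3 = rep(letters[1:2], 10),
                 column4 = rnorm(20),
                 column5 = rnorm(20),
                 column6 = rnorm(20),
                 column7 = rnorm(20))

all.columns <- c("column4", "column5", "column6", "column7")

# Collect one summary table per column; a bare pipeline inside a for loop
# would compute the result and then throw it away
results <- list()
for (col in all.columns) {
  results[[col]] <- df %>%
    group_by(column3) %>%
    summarise(Mean = mean(.data[[col]]),  # .data[[col]] fetches the column named by the string
              Max  = max(.data[[col]]))
}

results$column4
```

Each element of results is a small table with one row per group, so results$column5 and the others can be inspected the same way.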

I know this can be done with data.table:

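library(data.table)
all.columns <- c("column4", "column5", "column6", "column7")
dt = data.table(df)
# .SDcols takes the column names as a character vector
dt[, lapply(.SD, function(x) data.table(mean(x), max(x))),
    by = column3, .SDcols = all.columns]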
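
If you would rather get one flat column per statistic, a variant of the same idea is to return a named list per column and flatten it with unlist(recursive = FALSE). This is only a sketch on made-up data; the name flat is hypothetical:

```r
library(data.table)

# Hypothetical data with the same layout as in the question
df <- data.frame(column3 = rep(letters[1:2], 10),
                 column4 = rnorm(20),
                 column5 = rnorm(20),
                 column6 = rnorm(20),
                 column7 = rnorm(20))

all.columns <- c("column4", "column5", "column6", "column7")
dt <- data.table(df)

# For each column in .SDcols build list(mean, max), then flatten one level
# so that every statistic becomes its own column in the result
flat <- dt[, unlist(lapply(.SD, function(x) list(mean = mean(x), max = max(x))),
                    recursive = FALSE),
           by = column3, .SDcols = all.columns]

flat
```

This should give one row per group, with the grouping column followed by two columns for each summarised column.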

Upvotes: 0
