Jackson

Reputation: 63

Looping Multiple Variables in R

I am new to R. I have a data frame with firm-level data such as revenue, profit and costs. I need to loop over three variables (revenue, profit and costs) in this code:

datagroup %>% group_by(treat) %>% summarise(n = n(), mean = mean(profit), std_error = sd(profit) / sqrt(n))

Basically, I want to run the same code for revenue and costs by substituting each of them for profit. Could you assist? I tried for loops, but to no avail.

Upvotes: 2

Views: 128

Answers (2)

Parfait

Reputation: 107587

Since you are new to R, consider base R, which can apply multiple aggregate functions to multiple numeric columns with cbind + aggregate + do.call:

do.call(data.frame,
   aggregate(cbind(revenue, costs, profit) ~ treat,   # column is named costs in the question
             datagroup,
             function(x) c(n = length(x),
                           mean = mean(x),
                           std_error = sd(x) / sqrt(length(x))
                          )
   )
)
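
As a quick check of what this returns, here is a minimal sketch on made-up data. The values in `datagroup` and the helper name `summ` are invented for illustration, and the column is assumed to be called `costs` as in the question. `aggregate` returns one matrix column per variable, and `do.call(data.frame, ...)` flattens those into ordinary columns such as `revenue.n`, `revenue.mean` and `revenue.std_error`.

```r
# Toy data, invented purely for illustration
datagroup <- data.frame(
  treat   = c(0, 0, 0, 1, 1, 1),
  revenue = c(10, 12, 14, 20, 22, 24),
  costs   = c(5, 6, 7, 8, 9, 10),
  profit  = c(5, 6, 7, 12, 13, 14)
)

# Hypothetical helper: returns a named vector of statistics for one column
summ <- function(x) {
  c(n = length(x),
    mean = mean(x),
    std_error = sd(x) / sqrt(length(x)))
}

# One row per treat group, three flattened columns per variable
res <- do.call(data.frame,
               aggregate(cbind(revenue, costs, profit) ~ treat,
                         data = datagroup, FUN = summ))

names(res)
```

Without the `do.call(data.frame, ...)` wrapper the result still has the same numbers, but each variable is stored as a three-column matrix inside the data frame, which is awkward to work with downstream.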

Upvotes: 0

akrun

Reputation: 887118

We can loop over the column names as strings, convert each string to a symbol with rlang::sym, and unquote it with !! inside summarise to get the mean and standard error:

library(tidyverse)
c("revenue", "costs") %>%
  map(~ datagroup %>%
        group_by(treat) %>%
        summarise(n = n(),
                  # build the output name, and convert the string to a symbol
                  !! str_c("mean_", .x) := mean(!! rlang::sym(.x)),
                  !! str_c("std_error_", .x) := sd(!! rlang::sym(.x)) / sqrt(n)))

We can also do this with summarise_at:

c("revenue", "costs") %>%
  map(~ datagroup %>%
        group_by(treat) %>%
        # .add replaces the deprecated add argument in dplyr >= 1.0.0
        group_by(n = n(), .add = TRUE) %>%
        summarise_at(vars(.x),
                     list(mean = ~ mean(.x),
                          std_error = ~ sd(.x) / sqrt(first(n)))))

In both cases, the output is a list of data frames, one per variable.
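
If a single data frame is more convenient than a list, one option is to give every summary the same column names and stack the results. The following is only a sketch under assumptions: the toy `datagroup` values are invented, `vars_to_summarise` is an illustrative name, and the per-variable column names (`mean_revenue` and so on) are swapped for generic `mean` and `std_error` so that the pieces line up when bound together.

```r
library(dplyr)
library(purrr)

# Toy data, invented purely for illustration
datagroup <- tibble(
  treat   = c(0, 0, 0, 1, 1, 1),
  revenue = c(10, 12, 14, 20, 22, 24),
  costs   = c(5, 6, 7, 8, 9, 10),
  profit  = c(5, 6, 7, 12, 13, 14)
)

vars_to_summarise <- c("revenue", "costs", "profit")

res <- vars_to_summarise %>%
  set_names() %>%                      # name each list element after its variable
  map(~ datagroup %>%
        group_by(treat) %>%
        summarise(n = n(),
                  mean = mean(!! rlang::sym(.x)),
                  std_error = sd(!! rlang::sym(.x)) / sqrt(n))) %>%
  bind_rows(.id = "variable")          # list names become the "variable" column

res
```

The result has one row per variable and treat group, with the variable name in its own column, which makes it easy to filter or plot afterwards.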

Upvotes: 2
