TJames

Reputation: 35

Replacing an object name with characters in a loop

I am new to R, coming from SAS. In SAS I would use a global macro variable to accomplish this, but I haven't found the equivalent in R yet. I want to use a loop, or some other R capability, to simplify my code by substituting an object name (held as a character string) and appending additional text ('.sum') to that name. I start with the code below:

RED.sum <- aggregate(y ~ x, data = RED, FUN = "mean")
ORANGE.sum <- aggregate(y ~ x, data = ORANGE, FUN = "mean")
YELLOW.sum <- aggregate(y ~ x, data = YELLOW, FUN = "mean")
GREEN.sum <- aggregate(y ~ x, data = GREEN, FUN = "mean")
BLUE.sum <- aggregate(y ~ x, data = BLUE, FUN = "mean")

What can I use to reduce this to one generic line of code:

w.sum <- aggregate(y ~ x, data = w, FUN = "mean")

and cycle through the data names (RED, ORANGE, YELLOW, GREEN, BLUE) assigning the value to 'w'?

Upvotes: 1

Views: 111

Answers (1)

David Robinson

Reputation: 78590

You don't want to keep these as separate variables (see: keep data out of your variable names).

One option is to keep them in a list, and apply the same function to each with lapply:

lst <- list(RED, ORANGE, YELLOW, GREEN, BLUE)

sums <- lapply(lst, function(w) aggregate(y ~ x, data = w, FUN = "mean"))
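Since the question starts from the names as character strings, here is a minimal sketch of how the list could be built from those names with mget, so that each result stays labeled by its color. The toy data frames and the color_names vector are assumptions made up for illustration; only the x and y columns come from the question.

```r
# Toy stand-ins for the asker's data frames (assumed to have columns x and y)
RED    <- data.frame(x = c("a", "a", "b"), y = c(1, 3, 5))
ORANGE <- data.frame(x = c("a", "b", "b"), y = c(2, 4, 6))

# Hypothetical vector holding the object names as character strings
color_names <- c("RED", "ORANGE")

# mget() looks the objects up by name and returns a named list
lst <- mget(color_names)

# Same lapply call as above; the names carry over to the results
sums <- lapply(lst, function(w) aggregate(y ~ x, data = w, FUN = mean))

# Access one summary by its color name instead of a separate RED.sum object
sums$RED
```

With a named list, sums$RED or sums[["RED"]] plays the role that RED.sum had in the original code.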

However, if the datasets are otherwise similar, you should probably instead combine them into one table with a color column. For example:

combined <- rbind(cbind(RED, color = "Red"),
                  cbind(ORANGE, color = "Orange"),
                  cbind(YELLOW, color = "Yellow"))

aggregate(y ~ x + color, data = combined, FUN = "mean")
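If the data frames are already in a named list, the combined table can probably be built without writing one cbind call per color. A rough sketch in base R, using made-up toy data (only the x, y and color column names come from the code above):

```r
# Toy stand-ins for the asker's data frames (assumed to have columns x and y)
RED    <- data.frame(x = c("a", "a", "b"), y = c(1, 3, 5))
ORANGE <- data.frame(x = c("a", "b", "b"), y = c(10, 20, 30))

# Named list; the names become the values of the color column
lst <- list(RED = RED, ORANGE = ORANGE)

# Add a color column to each data frame, then stack them into one table
combined <- do.call(rbind, Map(cbind, lst, color = names(lst)))

# One aggregate call now covers every color
by_color <- aggregate(y ~ x + color, data = combined, FUN = mean)
by_color
```

Adding another color then only means adding another element to the list; the rest of the code stays the same.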

An alternative for this last operation (that happens to be a lot faster on large datasets) is to use group_by and summarize from the dplyr package:

library(dplyr)
combined %>%
  group_by(x, color) %>%
  summarize(y = mean(y))

Upvotes: 1
