potato

Reputation: 105

Return a changed list in R via lapply(), but objects in list not changed

I'm trying to loop through a list of data frames, dropping columns that don't meet some condition. In this example, each data frame should end up with one column fewer. After running the function, the data frames inside the list are changed, but the original data frames themselves are not.

df1 <- data.frame(
  a = c("John","Peter","Dylan"),
  b = c(1, 2, 3),
  c = c("yipee", "ki", "yay"))

df2 <- data.frame(
  a = c("Ray","Bob","Derek"),
  b = c(4, 5, 6),
  c = c("yum", "yummy", "donuts"))


df3 <- data.frame(
  a = c("Bill","Sam","Nate"),
  b = c(7, 8, 9),
  c = c("I", "eat", "cake"))

l <- list(df1, df2, df3)

drop_col <- function(x) {
  x <- x[, !names(x) %in% c("e", "b", "f")]
  return(x)
}

l <- lapply(l, drop_col)

When I print the list l, I get a list of data frames with the changes I want. When I print df1, df2 or df3 directly, the column has not been dropped.

I've looked at this solution and many others, but I'm obviously missing something.

Upvotes: 0

Views: 766

Answers (3)

nigelhenry

Reputation: 489

The problem is that when you create l, you fill it with copies of your data frames df1, df2 and df3, so changing the list elements never touches the originals. In R, it is not generally possible to pass references to variables. One workaround is to write the modified list elements back into the global environment with list2env(), as @Ronak Shah does.

Another is to use get() and <<- to change the variable within the function.

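drop_cols <- function(x) {
  # x is a character vector of variable names, e.g. c("df1", "df2")
  for (iter in x) {
    # Has the effect of `df1 <<- drop_col(df1)`: overwrites the variable in the
    # enclosing environment (the global one when drop_cols is defined at top level)
    do.call("<<-", list(iter, drop_col(get(iter))))
  }
}
drop_cols(c("df1", "df2", "df3"))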
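
To see the copy behaviour described above in isolation, here is a minimal sketch. The names `original` and `container` are made up for illustration and are not from the question.

```r
# A small data frame and a list built from it
original  <- data.frame(a = 1:3, b = 4:6)
container <- list(original)   # the list element behaves as an independent copy

# Drop column b from the list element only
container[[1]]$b <- NULL

names(container[[1]])  # "a"
names(original)        # "a" "b"  (the original is untouched)
```

This is why reassigning l with lapply() changes what is in the list but leaves df1, df2 and df3 as they were.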

Upvotes: 2

hello_friend

Reputation: 5788

df1 <- data.frame(
  a = c("John","Peter","Dylan"),
  b = c(1, 2, 3),
  c = c("yipee", "ki", "yay"))

df2 <- data.frame(
  a = c("Ray","Bob","Derek"),
  b = c(4, 5, 6),
  c = c("yum", "yummy", "donuts"))


df3 <- data.frame(
  a = c("Bill","Sam","Nate"),
  b = c(7, 8, 9),
  c = c("I", "eat", "cake"))
# Name the list elements:
l <- list(df1 = df1, df2 = df2, df3 = df3)

drop_col <- function(x) {
  # drop = FALSE keeps the result a data frame even if only one column remains
  x <- x[, !names(x) %in% c("e", "b", "f"), drop = FALSE]
  return(x)
}

l <- lapply(l, drop_col)

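# View an altered data frame ([[ ]] extracts the data frame itself,
# whereas l["df1"] would return a list of length one):
View(l[["df1"]])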

Upvotes: 0

Ronak Shah

Reputation: 388817

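The list l and the data frames df1, df2, etc. are independent objects; changing one has no effect on the other. One way to get the changed data frames back is to name the list elements after the original variables and write them into the global environment with list2env(), which overwrites df1, df2 and df3.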

l <- lapply(l, drop_col)
names(l) <- paste0("df", 1:3)
list2env(l, .GlobalEnv)
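
A variant of the same idea, as a rough sketch: mget() can collect the data frames into a named list by variable name, so the names never have to be assigned separately. The shortened two-row data frames and the variable names `env` and `dfs` are assumptions made for illustration; drop_col is the question's function with drop = FALSE added.

```r
df1 <- data.frame(a = c("John", "Peter"), b = c(1, 2), c = c("yipee", "ki"))
df2 <- data.frame(a = c("Ray", "Bob"),    b = c(4, 5), c = c("yum", "yummy"))

drop_col <- function(x) {
  x[, !names(x) %in% c("e", "b", "f"), drop = FALSE]
}

env <- environment()                       # at top level this is the global environment
dfs <- mget(c("df1", "df2"), envir = env)  # named list with names "df1" and "df2"
dfs <- lapply(dfs, drop_col)               # lapply() preserves the names
invisible(list2env(dfs, envir = env))      # overwrite df1 and df2 with the results

names(df1)  # "a" "c"
```

If the data frames are only ever processed together, it may be simpler to keep them in the named list and skip the write-back step.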

Upvotes: 2
