Reputation: 539
I have many data frames - Here is a simplified version of two of them.
flows <- structure(list(Student = c("Adam", "Char", "Fred", "Greg", "Ed", "Mick", "Dave", "Nick", "Tim", "George", "Tom"),
Class = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 3L), Jan_18_score = c(NA, 5L, -7L, 2L, 1L, NA, 5L, 8L, -2L, 5L, NA),
Feb_18_score = c(2L, 0, 8L, NA, 2L, 6L, NA, 8L, 7L, 3L, 8L), Jan_18_Weight = c(150L, 30L, NA, 80L, 60L, 80L, 40L, 12L, 23L, 65L, 78L),
Feb_18_Weight = c(153L, 60L, 80L, 40L, 80L, 30L, 25L, 45L, 40L, NA, 50L)), class = "data.frame", row.names = c(NA, -11L))
returns <- structure(list(Student = c("Adam", "Char", "Fred", "Greg", "Ed", "Mick", "Dave", "Nick", "Tim", "George", "Tom"),
Class = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 3L), Jan_20_score = c(NA, 5L, -7L, 2L, 1L, NA, 5L, 8L, -2L, 5L, NA),
Feb_20_score = c(2L, 0, 8L, NA, 2L, 6L, NA, 8L, 7L, 3L, 8L), Jan_20_Weight = c(150L, 30L, NA, 80L, 60L, 80L, 40L, 12L, 23L, 65L, 78L),
Feb_20_Weight = c(153L, 60L, 80L, 40L, 80L, 30L, 25L, 45L, 40L, NA, 50L)), class = "data.frame", row.names = c(NA, -11L))
I am using lapply to remove some observations, I would like to do this across all my dataframes and keep the output as dataframes, basically update the existing dataframes and remove the observations I select.
Here is my current code.
df.list <- list(flows, returns)
lapply(df.list, function(df) df[!grepl("1", df$Class),])
However, when I do this the output is not updating the original dataframes and is outputting as a list in the global environment. Any help is appreciated.
Upvotes: 0
Views: 77
Reputation: 1203
Another solution:
flows <- structure(list(Student = c("Adam", "Char", "Fred", "Greg", "Ed", "Mick", "Dave", "Nick", "Tim", "George", "Tom"),
Class = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 3L), Jan_18_score = c(NA, 5L, -7L, 2L, 1L, NA, 5L, 8L, -2L, 5L, NA),
Feb_18_score = c(2L, 0, 8L, NA, 2L, 6L, NA, 8L, 7L, 3L, 8L), Jan_18_Weight = c(150L, 30L, NA, 80L, 60L, 80L, 40L, 12L, 23L, 65L, 78L),
Feb_18_Weight = c(153L, 60L, 80L, 40L, 80L, 30L, 25L, 45L, 40L, NA, 50L)), class = "data.frame", row.names = c(NA, -11L))
returns <- structure(list(Student = c("Adam", "Char", "Fred", "Greg", "Ed", "Mick", "Dave", "Nick", "Tim", "George", "Tom"),
Class = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 3L), Jan_20_score = c(NA, 5L, -7L, 2L, 1L, NA, 5L, 8L, -2L, 5L, NA),
Feb_20_score = c(2L, 0, 8L, NA, 2L, 6L, NA, 8L, 7L, 3L, 8L), Jan_20_Weight = c(150L, 30L, NA, 80L, 60L, 80L, 40L, 12L, 23L, 65L, 78L),
Feb_20_Weight = c(153L, 60L, 80L, 40L, 80L, 30L, 25L, 45L, 40L, NA, 50L)), class = "data.frame", row.names = c(NA, -11L))
df.list <- list(flows, returns)
Now, we need to assign lapply
to some value and name it:
a <- lapply(df.list, function(df) df[!grepl("1", df$Class),])
names(a) <- c("flows","returns")
After this, we call list2env
function:
list2env(a, envir = .GlobalEnv)
Output:
> flows
Student Class Jan_18_score Feb_18_score Jan_18_Weight Feb_18_Weight
5 Ed 2 1 2 60 80
6 Mick 2 NA 6 80 30
7 Dave 3 5 NA 40 25
8 Nick 3 8 8 12 45
9 Tim 3 -2 7 23 40
10 George 3 5 3 65 NA
11 Tom 3 NA 8 78 50
> returns
Student Class Jan_20_score Feb_20_score Jan_20_Weight Feb_20_Weight
5 Ed 2 1 2 60 80
6 Mick 2 NA 6 80 30
7 Dave 3 5 NA 40 25
8 Nick 3 8 8 12 45
9 Tim 3 -2 7 23 40
10 George 3 5 3 65 NA
11 Tom 3 NA 8 78 50
Checking classes of the outputs:
> class(returns)
[1] "data.frame"
> class(flows)
[1] "data.frame"
Upvotes: 2
Reputation: 2026
I'm not sure about using lapply
but you can work with lists of variables by name using get and assign.
flows <- structure(list(Student = c("Adam", "Char", "Fred", "Greg", "Ed", "Mick", "Dave", "Nick", "Tim", "George", "Tom"),
Class = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 3L), Jan_18_score = c(NA, 5L, -7L, 2L, 1L, NA, 5L, 8L, -2L, 5L, NA),
Feb_18_score = c(2L, 0, 8L, NA, 2L, 6L, NA, 8L, 7L, 3L, 8L), Jan_18_Weight = c(150L, 30L, NA, 80L, 60L, 80L, 40L, 12L, 23L, 65L, 78L),
Feb_18_Weight = c(153L, 60L, 80L, 40L, 80L, 30L, 25L, 45L, 40L, NA, 50L)), class = "data.frame", row.names = c(NA, -11L))
returns <- structure(list(Student = c("Adam", "Char", "Fred", "Greg", "Ed", "Mick", "Dave", "Nick", "Tim", "George", "Tom"),
Class = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 3L), Jan_20_score = c(NA, 5L, -7L, 2L, 1L, NA, 5L, 8L, -2L, 5L, NA),
Feb_20_score = c(2L, 0, 8L, NA, 2L, 6L, NA, 8L, 7L, 3L, 8L), Jan_20_Weight = c(150L, 30L, NA, 80L, 60L, 80L, 40L, 12L, 23L, 65L, 78L),
Feb_20_Weight = c(153L, 60L, 80L, 40L, 80L, 30L, 25L, 45L, 40L, NA, 50L)), class = "data.frame", row.names = c(NA, -11L))
df.list <- list("flows", "returns")
for (df.name in df.list){
temp <- get(df.name)
temp <- temp[!grepl("1", temp$Class), ]
assign(paste0(df.name, "_new"), temp)
}
Remove "_new" to overwrite the original variables.
Upvotes: 1