Reputation: 27
I'm terrible at for loops in R. My typical use case is modifying many dataframes in the same way. In the following example, I am trying to add rownames to each dataframe in a list. It's writing to a new dataframe called df, but I want to overwrite each one individually (make the rownames equal to the values in column B).
df1 <- data.frame(A = c(1,2,3,4,5),
                  B = c("red","blue","green","orange","yellow"))
df2 <- data.frame(A = c(6,7,8,9,10),
                  B = c("A","B","C","D","E"))
df3 <- data.frame(A = c(11,12,13,14,15),
                  B = c("F","G","H","I","J"))
df_list <- list(df1,df2,df3)
for (df in df_list) {
  rownames(df) <- df$B
}
Thanks for any help with modifying dataframes in R. I'd like to start incorporating loops like this regularly into my workflow.
Upvotes: 0
Views: 284
Reputation: 3397
As was pointed out, if you have objects that belong together (e.g., because they have a similar structure or you want to modify them in the same way), keep them as a list and don't ever modify the individual elements in the global environment.
If you insist on doing just that, the code below shows you how to access objects in the global environment (as opposed to modifying objects within the scope of a function).
# create sample data
df1 <- data.frame(A = c(1,2,3,4,5),
                  B = c("red","blue","green","orange","yellow"))
df2 <- data.frame(A = c(6,7,8,9,10),
                  B = c("A","B","C","D","E"))
df3 <- data.frame(A = c(11,12,13,14,15),
                  B = c("F","G","H","I","J"))
# bind as list
df_list <- dplyr::lst(df1, df2, df3)
# define function to change rownames
g <- function(df_name) {
  rownames(.GlobalEnv[[df_name]]) <- .GlobalEnv[[df_name]]$B
}
# apply function
purrr::walk(names(df_list), g)
# check result
df1
#> A B
#> red 1 red
#> blue 2 blue
#> green 3 green
#> orange 4 orange
#> yellow 5 yellow
df2
#> A B
#> A 6 A
#> B 7 B
#> C 8 C
#> D 9 D
#> E 10 E
df3
#> A B
#> F 11 F
#> G 12 G
#> H 13 H
#> I 14 I
#> J 15 J
Created on 2023-12-01 with reprex v2.0.2
Whether you use purrr::walk() or a for-loop doesn't matter here.
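For comparison, here is a minimal sketch of a for-loop that follows the keep-it-in-a-list advice instead of touching the global environment. The loop in the question fails because `for (df in df_list)` hands the loop body a copy of each element, so changes to `df` never reach the list. Indexing into the list by position avoids that. The shortened sample data is only for illustration.

```r
# Shortened sample data (illustrative only)
df_list <- list(
  data.frame(A = c(1, 2, 3), B = c("red", "blue", "green")),
  data.frame(A = c(6, 7, 8), B = c("A", "B", "C"))
)

# Loop over positions, not elements, so the assignment writes back into the list
for (i in seq_along(df_list)) {
  rownames(df_list[[i]]) <- df_list[[i]]$B
}

rownames(df_list[[1]])
#> [1] "red"   "blue"  "green"
```

The data frames created before the list was built are left untouched; only the elements inside `df_list` are modified.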
Upvotes: 0
Reputation: 6227
There's a typo in your example: df is a function, not a data.frame you have defined.
Anyway, it's better to use lapply when dealing with lists. You can create an anonymous function and perform operations using that.
> lapply(df_list, function(x) {rownames(x) = x$B; x})
[[1]]
A B
red 1 red
blue 2 blue
green 3 green
orange 4 orange
yellow 5 yellow
[[2]]
A B
A 6 A
B 7 B
C 8 C
D 9 D
E 10 E
[[3]]
A B
F 11 F
G 12 G
H 13 H
I 14 I
J 15 J
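Note that lapply returns a new list and leaves df_list itself unchanged, so the result has to be assigned back if you want to keep it. A minimal sketch, assuming a named list (the names and the shortened data are only for illustration):

```r
# Shortened sample data in a named list (names are illustrative)
df_list <- list(
  df1 = data.frame(A = c(1, 2, 3), B = c("red", "blue", "green")),
  df2 = data.frame(A = c(6, 7, 8), B = c("A", "B", "C"))
)

# Overwrite the list with the modified copies; the function must return x
df_list <- lapply(df_list, function(x) {
  rownames(x) <- x$B
  x
})

rownames(df_list$df1)
#> [1] "red"   "blue"  "green"
```

lapply keeps the names of the input list, so each modified data frame can still be accessed as df_list$df1, df_list$df2, and so on.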
Upvotes: 0