lizronan22

Reputation: 7

Exporting large list of dataframes to csv after split

I am creating a script that will split a dataframe into groups based on the column "participant_id" and export each group as a csv. Right now, I am working with a dataframe that only has 7 participant ids, but the script will be used for a csv with hundreds of participants.

First, here is a list of dataframes:

participant_id <- c("1", "1", "1", "2", "2", "2", "3", "3", "3", "4", "4", "4")

text <- c("Message1","Message1","Message1",
 "Message2", "Message2", "Message2", 
 "Message3", "Message3", "Message3", 
 "Message4", "Message4", "Message4")

df <- data.frame(participant_id, text)

df_list <- split(df, df$participant_id)

Then I clean each dataframe in the list using a function I wrote called clean_log (the dataframe contains texting logs):

df_list <- lapply(df_list, clean_log)

I want to write each dataframe in this list to a csv, but the dataframes need to be saved as objects first. I tried naming them:

names <- c()

for (i in 1:length(df_list)) {
  names <- c(names, paste0("df", i))
}

names(df_list) <- names

Then I tried exporting the dataframes in a for loop, but got the error "Error in get(names[i]) : object 'df1' not found":

for (i in 1:length(names)) {
  write.csv(get(names[i]),
            paste0(path, names[i], ".csv"), row.names = FALSE)
}

I know I could simply write

df1 <- df_list[[1]]
df2 <- df_list[[2]]
...

to name each dataframe, but this isn't going to work when there are hundreds of dataframes to export. Has anyone run into an issue like this, or does anyone have any advice?

Upvotes: 0

Views: 421

Answers (1)

Nick Ulle

Reputation: 419

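You don't need to use get here, and in general using get is bad practice. The error happens because get looks up a variable by name, and assigning names(df_list) only labels the elements of the list; it does not create variables called df1, df2, and so on.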
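To make the difference concrete, here is a small sketch using a cut-down version of the data from the question (the toy data and the choice of two participants are my own, just for illustration). It shows that a named list element can be reached through the list itself, with no need for a variable of the same name:

```r
# Toy data standing in for the question's dataframe (illustrative only).
df <- data.frame(participant_id = c("1", "1", "2", "2"),
                 text = c("Message1", "Message1", "Message2", "Message2"))
df_list <- split(df, df$participant_id)

# Name the elements the same way the question does.
names(df_list) <- paste0("df", seq_along(df_list))

# The name "df1" is a label on a list element, not a variable,
# so index the list by name or by position instead of calling get().
by_name <- df_list[["df1"]]
by_position <- df_list[[1]]

is.data.frame(by_name)
identical(by_name, by_position)
```

Both lookups return the same data frame, which is what write.csv needs as its first argument.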

Your object df_list is a list of data frames. The first argument to write.csv should be the data frame you want to save. So you can write your loop as:

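# Build every file path in one vectorized call, using the list's own names.
paths = paste0(path, names(df_list), ".csv")
for (i in seq_along(df_list)) {
  # df_list[[i]] extracts the i-th data frame itself (not a one-element list).
  write.csv(df_list[[i]], paths[i], row.names = FALSE)
}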

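I lifted the paste0 call out of the loop because paste0 is vectorized: building all of the file paths in one call is more efficient and more idiomatic than calling it on every iteration. Note that paste0 does not insert a path separator, so path has to end with one.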
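For reference, here is a self-contained sketch of the whole flow that you can run as-is. A few things in it are my assumptions rather than part of your setup: the output folder (a subfolder of tempdir(), named participant_csvs), the "participant_" file name prefix, and the use of file.path instead of paste0 to join the folder and file name. It also skips your clean_log step, since I don't have that function. Since split already names each list element after its participant id, the separate naming loop isn't needed:

```r
# Example data in the same shape as the question's dataframe.
participant_id <- c("1", "1", "2", "2", "3", "3")
text <- c("Message1", "Message1", "Message2", "Message2", "Message3", "Message3")
df <- data.frame(participant_id, text)

# split() names each element after its participant id: "1", "2", "3".
df_list <- split(df, df$participant_id)

# Hypothetical output folder; replace with your own path.
out_dir <- file.path(tempdir(), "participant_csvs")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# file.path() inserts the separator, so out_dir needs no trailing slash.
paths <- file.path(out_dir, paste0("participant_", names(df_list), ".csv"))

# Write one csv per participant.
for (i in seq_along(df_list)) {
  write.csv(df_list[[i]], paths[i], row.names = FALSE)
}
```

Because the file names come from names(df_list), this works the same way for hundreds of participants as it does for three.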

Upvotes: 1
