Reputation: 105
I have multiple excel files, each with different worksheets. I have tried to use readxl and map to import it to R. However, I was only able to do it using a for loop. The code below works fine but I would like to know if there is a clever way to do this. I keep thinking that I could have done it with map2 but I am missing something.
library(tidyverse)
library(readxl)
library(writexl)
### As a first step, I get all the files from my project folder and create an empty list for looping purposes
files <- list.files(pattern = ".xlsx")
data_xlsx <- list()
### I then use seq_along in all the files and map_df to read the each excel file
for (i in seq_along(files)) {
data_xlsx[[i]] <- files[i] %>%
excel_sheets() %>%
set_names() %>%
map_df(
~ read_xlsx(path = files[i], sheet = .x, range = "H3"),
.id = "sheet")
}
# I use the code below to get the files name into the list
data_xlsx <- set_names(data_xlsx, files)
# This final code is just to transform the list into a data frame with a column with the name of the files
data_xlsx_df <- map2_df(data_xlsx, files, ~update_list(.x, file = .y))
Created on 2018-07-01 by the reprex package (v0.2.0).
Upvotes: 3
Views: 2635
Reputation: 1281
You can use nested map_df
calls to replace the for loop. As far as I know map2
can only operate on two lists of length n
and return a list of length n
, I don't think it's a way to generate a length n * m
list from two lists of length n
and m
.
files <- list.files(pattern = ".xlsx")
data_xlsx_df <- map_df(set_names(files), function(file) {
file %>%
excel_sheets() %>%
set_names() %>%
map_df(
~ read_xlsx(path = file, sheet = .x, range = "H3"),
.id = "sheet")
}, .id = "file")
Upvotes: 10