Reputation: 453
I have a directory full of .xlsx files. They all have multiple sheets. I want to extract the same sheet from all of the files and append them into a tibble.
I have found numerous solutions for extracting multiple sheets from a single Excel file; however, not a single sheet from multiple files.
I have tried:
paths = as.tibble(list.files("data/BAH", pattern = ".xlsx", full.names = TRUE, all.files = FALSE))
test <- paths %>% read_xlsx(sheet = "Portal", col_names = TRUE)
I know the "paths" variable contains all of my file names with path. However, I am not sure how to iterate through each file name appending just the specific sheet = "Portal" to a csv file.
The error is:
Error: path must be a string
I have tried to pass in paths as a vector, as a tibble, and tried sub-scripting it as well. All fails.
So, in summary. I have a directory of xlsx files and I need to extract a single sheet from each one and append it to a csv file. I have tried using purrr with some map functions but also could not get it to work.
My goal was to use the Tidy way.
Thanks for any hints.
Upvotes: 3
Views: 2209
Reputation: 453
library(tidyverse)
library(readxl)
library(fs)
# Get all files
xlsx_files <- fs::dir_ls("data/BAH", regexp = "\\.xlsx$")
paths = as_tibble(list.files("data/BAH", pattern = ".xlsx", full.names = TRUE, all.files = FALSE))
#portal_tabs <- map_dfr(paths, read_xlsx, sheet = "Portal", col_names = TRUE)
portal_tabs <- map_dfr(xlsx_files, read_xlsx, sheet = "Portal", col_names = TRUE, .id = 'source')
Upvotes: 1
Reputation: 2364
You have to use lapply()
or map()
. Try
test <- lapply(paths, read_xlsx, sheet = "Portal", col_names = TRUE)
or
library(purrr)
test <- map_dfr(paths, read_xlsx, sheet = "Portal", col_names = TRUE)
You can then bind the dataframes with
library(dplyr)
test %>% bind_rows()
Upvotes: 4