DevGin
DevGin

Reputation: 453

R read_excel or readxl Multiple Files with Multiple Sheets - Bind

I have a directory full of .xlsx files. They all have multiple sheets. I want to extract the same sheet from all of the files and append them into a tibble.

I have found numerous solutions for extracting multiple sheets from a single Excel file; however, not a single sheet from multiple files.

I have tried:

    paths = as.tibble(list.files("data/BAH", pattern = ".xlsx", full.names = TRUE, all.files = FALSE))

    test <- paths %>% read_xlsx(sheet = "Portal", col_names = TRUE)

I know the "paths" variable contains all of my file names with path. However, I am not sure how to iterate through each file name appending just the specific sheet = "Portal" to a csv file.

The error is:

Error: path must be a string

I have tried to pass in paths as a vector, as a tibble, and tried sub-scripting it as well. All fails.

So, in summary. I have a directory of xlsx files and I need to extract a single sheet from each one and append it to a csv file. I have tried using purrr with some map functions but also could not get it to work.

My goal was to use the Tidy way.

Thanks for any hints.

Upvotes: 3

Views: 2209

Answers (2)

DevGin
DevGin

Reputation: 453

library(tidyverse)    
library(readxl)
library(fs)

# Get all files
xlsx_files <- fs::dir_ls("data/BAH", regexp = "\\.xlsx$")

paths = as_tibble(list.files("data/BAH", pattern = ".xlsx", full.names = TRUE, all.files = FALSE))


#portal_tabs <- map_dfr(paths, read_xlsx, sheet = "Portal", col_names = TRUE)
portal_tabs <- map_dfr(xlsx_files, read_xlsx, sheet = "Portal", col_names = TRUE, .id = 'source')

Upvotes: 1

eastclintw00d
eastclintw00d

Reputation: 2364

You have to use lapply() or map(). Try

test <- lapply(paths, read_xlsx, sheet = "Portal", col_names = TRUE)

or

library(purrr)
test <- map_dfr(paths, read_xlsx, sheet = "Portal", col_names = TRUE)

You can then bind the dataframes with

library(dplyr)
test %>% bind_rows()

Upvotes: 4

Related Questions