Nora_R
Nora_R

Reputation: 61

Selecting specific columns based in column name from several sheets of an excel workbook

I've got an excel workbook with several sheets. Each sheet represents a year and contains several columns. Not all the sheets have the same columns names but I want to pull out specific columns with names that are shared in all sheets.

I've imported the sheet using the following function to make it into a list

    sheets <- readxl::excel_sheets(filename)
    x <- lapply(sheets, function(X) readxl::read_excel(filename, skip = 2, sheet = X))
    if(!tibble) x <- lapply(x, as.data.frame)
    names(x) <- sheets
    x }

I now have a large list containing about 7 elements. I created another list containing the column names I want to pull out from each sheet

select <-list(c("average","total","percentage","year on year"))

How do I use this list to pull out the respective columns from each sheet into a new df?

Sorry if fairly obvious and thanks in advance for any advice!

Upvotes: 0

Views: 264

Answers (1)

akrun
akrun

Reputation: 887641

We may select the columns afterwards. Also, instead of list of element 1, just pass a vector of column names

library(purrr)
library(readxl)
nm1 <- c("average","total","percentage","year on year")
map(sheets, ~ readxl::read_excel(filename, skip = 2, sheet = .x) %>%
                dplyr::select(all_of(nm1)) %>%
                 as_tibble))

Upvotes: 1

Related Questions