m.rodwell
m.rodwell

Reputation: 39

Add new column name to a list of data frames from a part of the file name using lapply

I have 10 excel files in a folder that all have the same columns. The end result is bind_rows and combine them all. Each file represents a week (in this case weeks 1-10). I am looking to see how I can add a new column called "Week" to the final product. Each file name looks like "...wk1.xlsx", "...wk2.xlsx", etc. I am trying to figure out how I can detect "wk1", etc and format that into a new column in a list of data frames.

Here's what I have...

files <- list.files(path ="Users/Desktop/week", pattern = "*.xlsx", full.names= T) %>%
         lapply(read_excel, sheet =4, skip =39) %>%
         bind_rows()

Upvotes: 1

Views: 1121

Answers (1)

zephryl
zephryl

Reputation: 17069

Name the list of filenames using setNames(), then use the .id argument in bind_rows(), which adds a column containing list names.

library(tidyverse)
library(readxl)

files <- list.files(path ="Users/Desktop/week", pattern = "*.xlsx", full.names= T) %>%
  setNames(nm = .) %>% 
  lapply(read_excel, sheet =4, skip =39) %>%
  bind_rows(.id = "Week") %>%
  mutate(Week = str_extract(Week, "wk\\d+"))

You could also combine the iteration and row-binding steps using purrr::map_dfr():

files <- list.files(path ="Users/Desktop/week", pattern = "*.xlsx", full.names= T) %>%
  setNames(nm = .) %>% 
  map_dfr(read_excel, sheet = 4, skip = 39, .id = "Week") %>%
  mutate(Week = str_extract(Week, "wk\\d+"))

Upvotes: 2

Related Questions