Reputation: 57
I have a working directory with a large number of xlsm files (600ish). I need to merge all of these files into one dataframe, but ONLY the second sheet of the excel file. Since there are a lot of files, ideally I would use a loop, but I'm struggling with how to do this. Right now I have this code, which is obviously not working. Any thoughts on how to best do this would be greatly appreciated.
library(readxl)
library(tidyverse)
data.files = list.files(pattern = "*.xlsm")
data_to_merge <- lapply(data.files, read_excel(x, sheet = 2))
combined_df <- bind_rows(data_to_merge)
Not sure how to include examples of the data so it's easily reproducible since my question is dealing with excel sheets, not data that's already in r, but if this is useful, all of the 2nd sheets have the same simple structure that looks something like this:
data1 <- data.frame(id = 1:6,
x1 = c(5, 1, 4, 9, 1, 2),
x2 = c("A", "Y", "G", "F", "G", "Y"))
data2 <- data.frame(id = 4:9,
y1 = c(3, 3, 4, 1, 2, 9),
y2 = c("a", "x", "a", "x", "a", "x"))
Upvotes: 1
Views: 60
Reputation: 16876
You were close. You just need to slightly alter your lapply
statement, so that the function and parameter are separated by a column.
library(readxl)
library(tidyverse)
data.files = list.files(pattern = "*.xlsm")
data_to_merge <- lapply(data.files, read_excel, sheet = 2)
combined_df <- bind_rows(data_to_merge)
Or a more tidyverse
approach:
combined_df <- list.files(pattern = "*.xlsm") %>%
map(., ~ read_excel(.x, sheet = 2)) %>%
bind_rows()
Upvotes: 1