Reputation: 61
I'm importing data from multiple excel files using the readxl package and I made a function in my script so that I only import specific sheets that I need
read_excel_sheets <- function(excelDoc) {
sheets <- readxl::excel_sheets(excelDoc)
sheets <- sheets[4:6]
x <- lapply(sheets, function(X) readxl::read_excel(excelDoc, sheet = X))
return(x)
}
#load files in folder
rawfiles <- list.files()
IMPORT <- lapply(rawfiles, FUN = read_excel_sheets)
After loading the files in my folder into my script, IMPORT becomes a list[10] that contains list[3] inside of it, basically lists inside of a list.
Unfortunately, I can't use reduce(full_join) to gather my data into one data table. I've tried working with just one excel file and using unlist() to see if I can get my sheets out of the lists of list but that did not work.
Test <- read_excel_sheets("Hop_L_Trial1.xlsx")
Test_Test <- unlist(Test)
I've also tried
rawfiles <- list.files()
IMPORT <- lapply(rawfiles,
FUN = read_excel_sheets)
Test_3 <- rbindlist(IMPORT)
and received an error "Column 1 of item 1 is length 2 inconsistent with column 2 which is length 6. Only length-1 columns are recycled." Any suggestions on how to join my data into one data table would be greatly appreciated, thank you.
Upvotes: 2
Views: 94
Reputation: 389155
You can use map_df
from purrr
to get data as a single dataframe.
read_excel_sheets <- function(excelDoc) {
sheets <- readxl::excel_sheets(excelDoc)
sheets <- sheets[4:6]
x <- purrr::map_df(sheets, function(X) readxl::read_excel(excelDoc, sheet = X))
return(x)
}
IMPORT <- purrr::map_df(rawfiles, FUN = read_excel_sheets)
You can also use do.call
+ rbind
base R functions.
read_excel_sheets <- function(excelDoc) {
sheets <- readxl::excel_sheets(excelDoc)
sheets <- sheets[4:6]
x <- do.call(rbind, lapply(sheets, function(X) readxl::read_excel(excelDoc, sheet = X)))
return(x)
}
IMPORT <- do.call(rbind, lapply(rawfiles, FUN = read_excel_sheets))
Upvotes: 1
Reputation: 475
For that data.table::rbindlist
approach just set fill=T
:
library(data.table)
dt_list = lapply(5:10, function(i) {
data.table(rnorm(i))
})
dt=rbindlist(dt_list, fill=T)
Upvotes: 1
Reputation: 12506
You can use bind_rows
from the dplyr
package, part of the tidyverse:
x <- list(tibble(x=runif(5)), tibble(x=runif(5)))
y <- list(tibble(x=runif(5)), tibble(x=runif(5)))
z = list(x, y)
ans <- bind_rows(z)
ans
A tibble: 20 x 1
x
<dbl>
1 0.746
2 0.0669
3 0.612
4 0.0702
5 0.990
6 0.301
7 0.177
8 0.799
9 0.242
10 0.741
11 0.651
12 0.113
13 0.927
14 0.506
15 0.477
16 0.922
17 0.217
18 0.566
19 0.539
20 0.327
Use (say) bind_rows(..., .id="Sheet")
if you need to track which element of the list each row came from originally.
Upvotes: 0