Reputation: 111
I want to merge columns of multiple xlsx files (about 100) which all have the same structure, meaning the all have 5 rows and 2 columns.
I'm using the answer provided here (with a slight edit to merge the columns instead of rows)
library(purrr)
library(writexl)
files <- list.files(pattern="*.xlsx")
mainDF <- files %>% map_dfc(read.xlsx)
write_xlsx(mainDF, "merge.xlsx")
but what I end up with is that the first row of every sheet is now the column name.
How do I keep the original column names in the merged file?
Also in the merged file there are new columns named "file" which display the file name, I would like to remove those as well.
Any help would be appreciated :)
EDIT
Example: I have file1.xlsx, file2.xlsx looking like
Data col1 Data col2
x 1 x 4
y 3 y 6
and my output is
1 4
3 6
but my goal is to have mergefile.xlsx
Data col1 col2
x 1 4
y 3 6
Upvotes: 1
Views: 385
Reputation: 399
I think your problem is that read.csv
doesn't set colnames of the dataframe that it produces to the headers of the input file. You can use readr::read_csv
to get this behavior.
library(dplyr)
library(readr)
files <- list.files(pattern="*.csv")
dfs <- lapply(files, read_csv)
combined_df <- Reduce(function(x, y) full_join(x, y, by = "Data"), dfs)
write_xlsx(combined_df, "merge.xlsx")
Upvotes: 0