maryam
maryam

Reputation: 111

R - Keeping column names after merging multiple .xlsx files

I want to merge columns of multiple xlsx files (about 100) which all have the same structure, meaning the all have 5 rows and 2 columns.

I'm using the answer provided here (with a slight edit to merge the columns instead of rows)

library(purrr)
library(writexl)


files <- list.files(pattern="*.xlsx")
mainDF <- files %>% map_dfc(read.xlsx) 

write_xlsx(mainDF, "merge.xlsx")

but what I end up with is that the first row of every sheet is now the column name.

How do I keep the original column names in the merged file?

Also in the merged file there are new columns named "file" which display the file name, I would like to remove those as well.

Any help would be appreciated :)

EDIT

Example: I have file1.xlsx, file2.xlsx looking like

Data col1              Data  col2
 x    1                 x     4
 y    3                 y     6

and my output is

1 4
3 6

but my goal is to have mergefile.xlsx

Data col1 col2
  x   1    4
  y   3    6

Upvotes: 1

Views: 385

Answers (1)

Akindele Davies
Akindele Davies

Reputation: 399

I think your problem is that read.csv doesn't set colnames of the dataframe that it produces to the headers of the input file. You can use readr::read_csv to get this behavior.

library(dplyr)
library(readr)

files <- list.files(pattern="*.csv")
dfs <- lapply(files, read_csv)

combined_df <- Reduce(function(x, y) full_join(x, y, by = "Data"), dfs)

write_xlsx(combined_df, "merge.xlsx")

Upvotes: 0

Related Questions