Comparing column headers across multiple sheets in an Excel file and importing the data into R

Question

I have an Excel file with data spread across several sheets, which I need to consolidate so I can draw insights from it.

The sheets are named after the months, running from November through October (12 sheets in total).

My code starts out like this:

# List of months to look at
months <- c("November", "December", "January", "February", "March", "April", "May", "June", "July", "August", "September")

What I want to do is match the column names in each of these sheets against an empty data frame (I call it discrepancies) and pull the data into those columns accordingly. My code looks like this:

library(readxl)

discrepancies <-
  setNames(
    data.frame(matrix(ncol = 12, nrow = 0)),
    c(
      "Date",
      "Officer",
      "Case Number",
      "Account Number",
      "Plan Type",
      "Type",
      "ID",
      "Transaction Amount",
      "Code",
      "Specialist",
      "Transit#",
      "Processed Via"
    )
  )

# Query each month's data and append it to the main data frame
for (i in months) {
  temp <- read_excel(
    "G:/Confidental.xlsx",
    sheet = i,
    col_names = TRUE,
    skip = 0
  )
  temp$`months` <- i
  discrepancies <- rbind(discrepancies, temp)
}

This code pulls in every field from each sheet rather than just the columns I want, and it fails when a sheet has a different number of columns than the discrepancies data frame. Any help is appreciated.

Ronak Shah · Accepted Answer

I don't think you need to create an empty data frame to compare all the columns. Try this approach:

library(readxl)

# Name the vector so that .id records the sheet name rather than its position
result <- purrr::map_df(purrr::set_names(months),
                        ~ read_excel("G:/Confidental.xlsx", sheet = .x),
                        .id = "months")

This combines all the sheets into one data frame. If a column is absent from a sheet, NA is inserted automatically for that column in that month's rows.
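Binding the sheets this way keeps every column that appears in any sheet. If only a fixed set of columns is wanted, as in the question, one option is to align each sheet to that set before binding. The following is a minimal base R sketch and not part of the original answer: the `sheets` list is made-up data standing in for what read_excel() would return, `wanted` is a shortened version of the column list from the question, and `align_columns` is a hypothetical helper name.

```r
# Made-up stand-ins for two sheets; in practice each element would be
# the data frame returned by read_excel() for one month (assumption).
sheets <- list(
  November = data.frame(Date = "2020-11-02", Officer = "A", Extra = 1,
                        stringsAsFactors = FALSE),
  December = data.frame(Date = "2020-12-07", Code = "X9",
                        stringsAsFactors = FALSE)
)

# Columns to keep (shortened from the list in the question).
wanted <- c("Date", "Officer", "Code")

# Hypothetical helper: keep only `cols`, in that order, adding an
# all-NA column for any name the sheet does not have.
align_columns <- function(df, cols) {
  for (col in setdiff(cols, names(df))) {
    df[[col]] <- rep(NA, nrow(df))
  }
  df[cols]
}

# Align every sheet and tag its rows with the sheet name.
aligned <- Map(function(df, month) {
  out <- align_columns(df, wanted)
  out$months <- rep(month, nrow(out))
  out
}, sheets, names(sheets))

# All pieces now share the same columns, so rbind() no longer fails.
result <- do.call(rbind, aligned)
rownames(result) <- NULL
```

Here the `Extra` column is dropped, and `Officer` and `Code` are filled with NA in the months where they are missing. To see which headers a given sheet lacks, `setdiff(wanted, names(df))` gives the list directly.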
