R openxlsx writeFormula gives a corrupted file

Question

I am trying to overwrite an Excel file with some new data.

My new data looks like this

new_data <- structure(list(date=c(1:30),var1 = c(NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_), var2 = c("=B13-B12", 
"=B14-B13", "=B15-B14", "=B16-B15", "=B17-B16", "=B18-B17", "=B19-B18", 
"=B20-B19", "=B21-B20", "=B22-B21", "=B23-B22", "=B24-B23", "=B25-B24", 
"=B26-B25", "=B27-B26", "=B28-B27", "=B29-B28", "=B30-B29", "=B31-B30", 
"=B32-B31", "=B33-B32", "=B34-B33", "=B35-B34", "=B36-B35", "=B37-B36", 
"=B38-B37", "=B39-B38", "=B40-B39", "=B41-B40", "=B42-B41")), row.names = c(NA, 
-30L), class = c("tbl_df", "tbl", "data.frame"))

My dataset has more columns but those are the ones I identified as the ones messing up the file, specifically the var2 one.

Then I just turn my data from wide to long so I can group the data and write all columns with group_map.

update <- new_data %>%
                mutate_if(is.numeric,as.character) %>%
                pivot_longer(-date,
                             names_to = "var",
                             values_to = "value") %>% 
                select(var,value) %>% 
                mutate(col=c(1:2))

(NAs are actually in a numeric column, that is why I need to change numeric to character -plus I have more numeric columns within my dataset)

This is how I am writing my data into the excel file.

library(openxlsx)
wb <- loadWorkbook(paste(path,"file.xlsx",sep='/'))

update %>% 
  group_by(var) %>% 
  group_map(~writeFormula(wb=wb, sheet ="FirstSheet", x=.x$value,
                          startCol=.x$col,startRow=13))
saveWorkbook(wb,paste(path,"new_file.xlsx",sep='/'),overwrite = TRUE)

I am not sure what I am doing wrong but I guess it is due to the NA values since I have other columns coded the exact same way and they seem to be working just fine (with some I get a warning saying that the formula differs from the ones around them but I can live with that).

How should I manage this NA values? (they need to remain as missing values, for example a zero would be incorrect)

I tried changing them to empty strings (new_data[is.na(new_data)]<-"") before writing them into the file but the problem is still there.

Note: The message I get when I try to open the file is that it needs to be repaired. If I do repair it, it looks fine but I cannot reopen it again once closed.

gaut · Accepted Answer

The below works for me as expected: non-missing formulas are written, NA's are left blank. No corrupted message from Excel.

update_df <- new_data %>%
  mutate_if(is.numeric,as.character) %>%
  pivot_longer(-date,
               names_to = "var",
               values_to = "value") %>% 
  select(var,value) %>% 
  mutate(col=3)



library(openxlsx)
wb <- loadWorkbook(paste("file.xlsx"))
  # group_by(var) %>% # filter(is.na(value)) %>% 
  purrr::map(updatedf,.f= ~ writeFormula(wb, x= updatedf$value, sheet=1, startCol = c(1), startRow = 1))
saveWorkbook(wb,"new_file.xlsx",overwrite = TRUE)

Check if:

Is file.xlsx corrupted? What happens if you load it and write it without any modifications, can you open file.xlsx and new_file.xlsx then without a warning?
Can you reproduce the above solution by using a blank file.xlsx? If you still see corruption in new_file.xlsx after starting from a blank file.xlsx, this would indicate an issue with either excel or openxlsx.
What version of Excel, R and openxlsx are you running?

My config:

R version 4.0.4 (2021-02-15)
openxlsx_4.2.5
Microsoft® Excel® 2019 MSO (Version 2207 Build 16.0.15427.20166) 32-bit

EDIT After discussion, the problem might come from overwriting existing formulas.

Try to write the same formulas into other cells in your excel file where you're sure they won't interfere
Try deleting existing content before writing with something like:

deleteData(wb, sheet = 1,cols = unique(update_df$col), rows = 1:60, gridExpand = TRUE)

R openxlsx writeFormula gives a corrupted file

Answers (1)

Related Questions