Christian
Christian

Reputation: 1052

openxlsx: read formula in cell as string

openxlsx::writeFormula lets us write a character vector as an Excel formula, such as SUM(B2:B4). However, the package does not have a readFormula counterpart that lets us read an Excel formula as a character vector when reading from a workbook, receiving strings such as SUM(B2:B4). I've tried to check through the workbook object with no success of finding these strings. Any idea how to find the formulas in the workbook object and convert it to a string?

Upvotes: 3

Views: 1962

Answers (3)

mayeulk
mayeulk

Reputation: 148

You can use openxlsx2 (which is largely based on openxlsx), with an option added in September 2022 (https://github.com/JanMarvin/openxlsx2/pull/352).

openxlsx2::read_xlsx(xlsxFile = path_of_xlsx_file,
  sheet = 'mySheet',
  showFormula = TRUE)

showFormula is an option (passed through ...) of the underlying wb_to_df() function (workbook to data.frame).

Migration from openxlsx to openxlsx2 is discussed in https://cran.r-project.org/web/packages/openxlsx2/vignettes/Update-from-openxlsx.html

Upvotes: 3

Christian
Christian

Reputation: 1052

The other answer from Marco_CH works great and comes right out of the box, but the downside about using the xlsx package is that it requires Java. I did, however, eventually find a solution with openxlsx:

readFormula <- function(wb, sheet, row, col) {
  require(dplyr)

  if (!is.numeric(sheet)) {
    sheet <- which(wb$sheet_names == sheet)
  }
  
  sheet_data <- wb[["worksheets"]][[sheet]][[".->sheet_data"]]
  cell <- ((sheet_data$cols == col) & (sheet_data$rows == row))
  formula <- 
    sheet_data$f[cell] %>% 
    stringr::str_remove("^<f>") %>% 
    stringr::str_remove("</f>$")
  
  return(formula)
}

Upvotes: 4

Marco_CH
Marco_CH

Reputation: 3294

I think openxlsx doesn't have this option.

But you could use:

library(xlsx)

sheetname=1
xls_file = read.xlsx(file, sheetName=sheetName, header=TRUE, keepFormulas=TRUE)

Output:

> xls_file
   Project       Date Phase                       Test
1        A 2021-01-01  Init                        2*6
2        A 2021-04-10   P2O                        2*7
3        B 2021-02-01  Init                        2*6
4        B 1931-03-20   P2O       IF(1=1, TRUE, FALSE)
5        B 2021-04-01 Build                        2*6
6        B 2021-08-01   Doc                        2*7
7        C 2021-03-10  Init                        2*6
8        C 2021-03-31   P2O VLOOKUP(B9,H6:L15,2,FALSE)
9        C 2021-05-20 Build                        2*6
10       D 2021-01-30  Init                        2*7
11       D 2021-07-30   P2O                        2*6

Column Test is the one with the formulas:

enter image description here

Upvotes: 1

Related Questions