Reputation: 30425
Apologies in advance if this is long winded.
I need to mimic the functionality of a particular set of excel spreadsheets. This needs to be done in R.
Getting the data etc is relatively straightforward. I need to now implement the formulae in a given worksheet.
Packages like XLConnect
enable the extraction of the formulae in string format. For a particular worksheet the data is as follows:
exForm <- structure(list(r = c("A2", "B2", "A3", "B3", "A4", "B4", "A5",
"B5", "A6", "B6", "A7", "B7"), formulae2 = c("1", "A2", "A2+1",
"SUM(A$2:A3)", "A3+1", "SUM(A$2:A4)", "A4+1", "SUM(A$2:A5)",
"A5+1", "SUM(A$2:A6)", "A6+1", "SUM(A$2:A7)"), x = c("2", "2",
"3", "3", "4", "4", "5", "5", "6", "6", "7", "7"), y = c("1",
"2", "1", "2", "1", "2", "1", "2", "1", "2", "1", "2")), .Names = c("r",
"formulae2", "x", "y"), class = "data.frame", row.names = c(NA,
-12L))
#> exForm
# r formulae2 x y
#1 A2 1 2 1
#2 B2 A2 2 2
#3 A3 A2+1 3 1
#4 B3 SUM(A$2:A3) 3 2
#5 A4 A3+1 4 1
#6 B4 SUM(A$2:A4) 4 2
#7 A5 A4+1 5 1
#8 B5 SUM(A$2:A5) 5 2
#9 A6 A5+1 6 1
#10 B6 SUM(A$2:A6) 6 2
#11 A7 A6+1 7 1
#12 B7 SUM(A$2:A7) 7 2
There is a neat bit of python code that implements an original javascript piece of code here.
I have passed my data to this set of functions using rPython
.
library(rPython)
excelURL <- "http://www.ewbi.com/ewbi.develop/samples/jsport_nonEAT.py"
download.file(excelURL, "excel.py")
python.exec("execfile('excel.py')")
python.assign("test", exForm$formulae2)
python.exec('t=[]
for i in range(len(test)):
\t p.parse(test[i])
\t t.append(p.prettyprint())
')
parseForm <- python.get('t')
If people dont have rPython
to hand the output of parseForm
is
c("1 <operand> <number>\n", "A2 <operand> <range>\n", "A2 <operand> <range>\n+ <operator-infix> <math>\n1 <operand> <number>\n",
"SUM <function> <start>\n A$2:A3 <operand> <range>\n <function> <stop>\n",
"A3 <operand> <range>\n+ <operator-infix> <math>\n1 <operand> <number>\n",
"SUM <function> <start>\n A$2:A4 <operand> <range>\n <function> <stop>\n",
"A4 <operand> <range>\n+ <operator-infix> <math>\n1 <operand> <number>\n",
"SUM <function> <start>\n A$2:A5 <operand> <range>\n <function> <stop>\n",
"A5 <operand> <range>\n+ <operator-infix> <math>\n1 <operand> <number>\n",
"SUM <function> <start>\n A$2:A6 <operand> <range>\n <function> <stop>\n",
"A6 <operand> <range>\n+ <operator-infix> <math>\n1 <operand> <number>\n",
"SUM <function> <start>\n A$2:A7 <operand> <range>\n <function> <stop>\n"
)
So I gather now parseForm
contains a tokenised version of the original formulae. To utilise in R would I
render the forms in parseForm
as R expressions?
> ex1 <- expression(1 + A1)
> A1 <- 10
> eval(ex1)
[1] 11
Has anyone experience with something similar or can someone point me in the right direction. For additional information I have the data in the worksheets represented as a dataframe so for example sheet1
here which the above are referring to is:
wData <- structure(c(NA, 1, 2, 3, 4, 5, 6, NA, 1, 3, 6, 10, 15, 21), .Dim = c(7L,
2L))
Upvotes: 2
Views: 541
Reputation: 269461
I gather that the idea is that the input is exForm
and the output should be wData
.
The following can be generalized further but its sufficient for the example in the question. Note that it assumes that any cell in exForm
only refers to cells above it (which is the case here and would likely be the case in most situations) so that we can proceed down the rows of exForm
linearly.
ss
removes dollar signs giving x1
, then transforms strings such as A2:A4
to wData[2:4, "A"]
giving x2
and then converts strings with embedded A2
and the like to strings with embedded wData[2, "A"]
and the like. What is left can be parsed as R and we do so. The for loop traverses exForm
linearly from the top applying ss
to each cell defined in exForm
. No addon packages are needed.
SUM <- sum
ss <- function(x) {
x1 <- gsub("$", "", x, fixed = TRUE)
x2 <- gsub("([[:alpha:]]+)([[:digit:]]+):\\1([[:digit:]]+)",
"wData[\\2:\\3, '\\1']", x1, perl = TRUE)
x3 <- gsub("([[:alpha:]]+)([[:digit:]]+)", "wData[\\2, '\\1']", x2)
eval.parent(parse(text = x3))
}
wData <- matrix(NA, nr = max(as.numeric(exForm$x)), nc = max(as.numeric(exForm$y)))
colnames(wData) <- LETTERS[1:ncol(wData)]
for(i in 1:nrow(exForm)) {
x <- as.numeric(exForm$x[i])
y <- as.numeric(exForm$y[i])
wData[x, y] <- ss(exForm$formulae2[i])
}
giving:
> wData
A B
[1,] NA NA
[2,] 1 1
[3,] 2 3
[4,] 3 6
[5,] 4 10
[6,] 5 15
[7,] 6 21
REVISED A number of corrections and simplifications were made.
Upvotes: 1