Felipe
Felipe

Reputation: 31

Creating binary operator in r

I need some help on creating a special kind of subtraction. I have a data table x and I must subtract two columns, say a and b. However, either column may not exist. If a column does not exist, its value in the subtraction should be set to zero.

So far, I have approached this problem by trying to define a new subtraction operator, %-%

Thus, for example, if x = data.table(a = 5, b = 2), then a %-% b should be 3, whereas a %-% d should be 5.

I have tried to define this subtraction operator as shown below. However, for some reason, my subtraction always yields zero! Can anyone help me understand what am I doing wrong and how may I correct my code?

library(data.table)
x = data.table(a = floor(10 * runif(5)), b = floor(10 * runif(5)), c =floor(10 * runif(5)))

`%-%` <- function(e1,e2, DT = x){
  ifelse(is.numeric(substitute(e1, DT)), e1 <- substitute(e1, DT), e1 <- 0)
  ifelse(is.numeric(substitute(e2, DT)), e2 <- substitute(e2, DT), e2 <- 0)
  return(e1 - e2)
}

x[, d := a %-% b]
x

x[, d := a %-% d]
x

Many thanks!

Upvotes: 3

Views: 964

Answers (3)

Onyambu
Onyambu

Reputation: 79318

`%-%`=function(a,b){
  DT=eval(sys.status()$sys.calls[[2]][[2]])
  a=substitute(a)
  b=substitute(b)
  stopifnot(is.name(a),is.name(b),is.data.table(DT))
  a=deparse(a)
  b=deparse(b)
  d=numeric(nrow(DT))
  a=if(!exists(a,DT)) d else get(a,DT)
  b=if(!exists(b,DT)) d else get(b,DT)
  a-b
 }
set.seed(5)
x = data.table(a = floor(10 * runif(5)), b = floor(10 * runif(5)), c =floor(10 * runif(5)))
x
   a b c
1: 2 7 2
2: 6 5 4
3: 9 8 3
4: 2 9 5
5: 1 1 2

x[,a%-%b]
[1] -5  1  1 -7  0
x[,a%-%f]# F is just a column of zeros since it does not exist:
[1] 2 6 9 2 1

Or you can just do:

x[,c("d","e","f"):=.(a%-%b,a%-%h,g%-%h)]
x
   a b c  d e f
1: 2 7 2 -5 2 0
2: 6 5 4  1 6 0
3: 9 8 3  1 9 0
4: 2 9 5 -7 2 0
5: 1 1 2  0 1 0

This function is written to work on a datatable only. For example:

 setDF(x)[,a%-%b]

 Error: is.data.table(DT) is not TRUE 
 setDT(x)[,a%-%b]
 [1] -5  1  1 -7  0

EDIT: This answer gives the correct value with regard to the order. (Most of the answers given below do not pass this test)

setDT(x)[,a%-%b]#Column subtract another
[1] -5  1  1 -7  0
setDT(x)[,b%-%a]#Reversing the order
[1]  5 -1 -1  7  0
setDT(x)[,b%-%b]#Column Subtract itself
[1] 0 0 0 0 0
setDT(x)[,a%-%f]#Column subtract a non-existing column
[1] 2 6 9 2 1
setDT(x)[,f%-%a]#a non-existing column subtract an existing column
[1] -2 -6 -9 -2 -1
x[,g%-%f] #subtract two non-existing columns
[1] 0 0 0 0 0

Upvotes: 1

akrun
akrun

Reputation: 887691

We can create a function with intersect for passing the column names into .SDcols, then Reduce by subtracting the corresponding rows of each column in .SD (Subset of Data.table)

f1 <- function(dat, .x, .y) intersect(names(dat), c(.x, .y))
x[, d := Reduce('-', .SD), .SDcols = f1(x, 'a', 'b')]
x[, e := Reduce(`-`, .SD), .SDcols = f1(x, 'a', 'f')]

x
#   a b c  d e
#1: 7 0 8  7 7
#2: 3 6 4 -3 3
#3: 9 9 8  0 9
#4: 3 6 2 -3 3
#5: 0 2 3 -2 0

Or if we want to change the OP's function by passing unquoted arguments, then use enquo to convert it to from quosure and then reconvert it back to string with quo_name. Create an intersection vector from the column names of the dataset, and use - in the Reduce

library(dplyr)
`%-%` <- function(e1,e2, DT){
           e1 <- quo_name(enquo(e1))
           e2 <- quo_name(enquo(e2))
           nm1 <- intersect(names(DT), c(e1, e2))
           DT[, Reduce(`-`, .SD), .SDcols = nm1]
    }

x[, d := `%-%`(a, b, .SD)]
x[, e := `%-%`(a, f, .SD)]

data

x <- structure(list(a = c(7L, 3L, 9L, 3L, 0L), b = c(0L, 6L, 9L, 6L, 
2L), c = c(8L, 4L, 8L, 2L, 3L)), .Names = c("a", "b", "c"), row.names = c("1:", 
"2:", "3:", "4:", "5:"), class = "data.frame")
setDT(x)

Upvotes: 1

YOLO
YOLO

Reputation: 21749

IIUC, you can try this way. We use exist function to ensure if the column is available in the data.

# helper function
do_sub <- function(df, col1 = 'a', col2='b')
{
  ans <- integer()
  if (exists(col1, df) & exists(col2, df)){
    ans <- append(ans, df[[col1]] - df[[col2]])
  } else if (exists(col1, df)){
    ans <- append(ans, df[[col1]] - 0)
  } else {
    ans <- append(ans, 0 - df[[col2]])
  }
  return (ans)

}

# compute new columns
df[, d := do_sub(.SD, col1 = 'a', col2 = 'b')]
df[, e := do_sub(.SD, col1 = 'a', col2 = 'f')]

print(df)

   a b c  d e
1: 7 0 8  7 7
2: 3 6 4 -3 3
3: 9 9 8  0 9
4: 3 6 2 -3 3
5: 0 2 3 -2 0

Upvotes: 0

Related Questions