This is a small challenge within a big project, so I'm going to try to keep this simple.
I'm attempting to conditionally add columns to a data.table, and then process them conditionally as well.
x <- T
y <- data.table(a = 1:10, b = c(rep(1,5), rep(2,5)))
y[ # filter some rows
  a != 1
][ # conditionally add two calculated columns
  ,
  if(x){
    `:=` (
      c = a*b,
      d = 1/b
    )
  }
][ # process columns and group
  ,
  list(
    a = sum(a),
    b = sum(b),
    if(x) c = sum(c) # only add c if it's created above
  ),
  by = if(x) list(b, d) else list(b) # only group by d if it's created above
]
Here is the output (the error refers to the second set of []):
Error in eval(expr, envir, enclos) : object 'd' not found
In addition: Warning message:
In deconstruct_and_eval(m, envir, enclos) :
Caught and removed `{` wrapped around := in j. := and `:=`(...) are
defined for use in j, once only and in particular ways. See help(":=").
Of course, the error is a symptom of the warning. How can I get this done?
As @Michal pointed out, putting the if() statement outside the data.table call is an option:
if(x) {
  y[
    ...
  ]
} else {
  y[
    ...
  ]
}
I'm hoping there's a way to get this done without repeating the code in its entirety, to simplify everything.
I can't think of a way of doing it inside the j-expression, because of how := gets evaluated in there (it really only works if it's at the root of the expression tree), but you could put it in the i-expression as a workaround:
x = FALSE
y[a != 1][x, `:=`(c = a * b, d = 1/b)][]
# a b
#1: 2 1
#2: 3 1
#3: 4 1
#4: 5 1
#5: 6 2
#6: 7 2
#7: 8 2
#8: 9 2
#9: 10 2
x = TRUE
y[a != 1][x, `:=`(c = a * b, d = 1/b)][]
# a b c d
#1: 2 1 2 1.0
#2: 3 1 3 1.0
#3: 4 1 4 1.0
#4: 5 1 5 1.0
#5: 6 2 12 0.5
#6: 7 2 14 0.5
#7: 8 2 16 0.5
#8: 9 2 18 0.5
#9: 10 2 20 0.5
Since c(1) is the same as c(1, NULL), c() can be used to build complete vectors when you're not sure how many elements they will contain. This matters here because if(x) without an else branch returns NULL when x is FALSE.
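A minimal base R sketch of that behaviour, in case it helps to see it in isolation. The variable names (optional, cols_off, cols_on, parts) are made up for illustration and are not part of the question:

```r
x <- FALSE

# if() without an else branch evaluates to NULL when the condition is FALSE
optional <- if (x) "d"
is.null(optional)   # TRUE

# c() drops NULL arguments, so the optional element simply disappears
cols_off <- c("b", if (x) "d")   # "b"

x <- TRUE
cols_on <- c("b", if (x) "d")    # "b" "d"

# the same holds for lists, which is the form j needs
parts <- c(list(a = 1), if (x) list(c = 3))
names(parts)        # "a" "c"
```

The character-vector form is the one used for by, and the list form is the one used for j.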
To conditionally include columns in j:
y[
  ,
  c(
    list(
      a = sum(a),
      b = sum(b)
    ),
    if(x) list(c = sum(c))
  )
]
And to conditionally include columns in by:
y[
  ,
  ...,
  by = c("b", if(x) "d")
]
by won't accept a vector of lists, but it will accept a vector of column names.
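Putting the pieces together, here is one possible end-to-end sketch. It makes a few assumptions that are not in the posts above: the filtered table is stored in a hypothetical variable z, the := step is guarded by an ordinary if() statement outside the [ ] call rather than placed in i, and the output columns are renamed sum_a and sum_c so they don't clash with the grouping column b. sum(b) is left out because b is a grouping column, so inside j it is a single value per group.

```r
library(data.table)

x <- TRUE
y <- data.table(a = 1:10, b = c(rep(1, 5), rep(2, 5)))

# filtering returns a new data.table, so y itself is not modified below
z <- y[a != 1]

# only the := step is conditional; it adds c and d to z by reference
if (x) z[, `:=`(c = a * b, d = 1 / b)]

res <- z[
  ,
  c(
    list(sum_a = sum(a)),
    if (x) list(sum_c = sum(c))  # NULL (dropped by c()) when x is FALSE
  ),
  by = c("b", if (x) "d")        # group by d only when it exists
]
res
```

With x set to FALSE the same code should run unchanged and produce only b and sum_a, since both if(x) expressions then evaluate to NULL.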