Reputation: 426
I want to list unique IDs within groups, where the grouping variable can be selected by the user. The following works:
if (useGroupVar1) {
  dt[, unique(id), .(group1a, group1b, group1c)]
} else {
  dt[, unique(id), group2]
}
The expressions I'm using in my code to filter rows are actually fairly long so I want to avoid duplicating code. I came up with this "solution", which doesn't actually work:
dt[,unique(id),if(useGroupVar1){.(group1a,group1b,group1c)}else{group2}]
If the condition leads to using group2 alone, it works (though the column is called if), but trying to get it to use .(group1a,group1b,group1c) results in
Error in eval(expr, envir, enclos) : could not find function "."
Now, I read that .() is an alias for list(), so using the latter gets me this
Error in bysubl[[jj + 1L]] : subscript out of bounds
Is there a way to implement a conditional by without duplicating entire expressions?
Upvotes: 0
Views: 407
Reputation: 70246
Just personal preference, but I don't like pasting strings in a by= statement of a data.table (not very readable to me).
Instead, I would use a user-selected variable (var) and create a list of grouping variables. Then, you can easily select the variables like so:
groupVars <- list(
GroupVar1 = c("group1a","group1b","group1c"),
GroupVar2 = c("groupXYZ", "groupABC"),
GroupVarX = "group2"
)
# user selects that - for example - var = "GroupVar2"
dt[, unique(id), by = groupVars[[var]]]
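To try the lookup end to end, here is a minimal sketch on made-up data. The values in dt are invented for illustration, and byCols is a name introduced here: storing the selected column names in a plain variable first keeps the by= argument to a simple character vector, which may be more robust across data.table versions than an inline expression.

```r
library(data.table)

# Toy data: column names follow the question, the values are invented
dt <- data.table(
  id      = c(1, 1, 2, 3, 3, 4),
  group1a = c("a", "a", "a", "b", "b", "b"),
  group1b = c("x", "x", "y", "y", "y", "y"),
  group1c = c("p", "p", "p", "p", "q", "q"),
  group2  = c("g1", "g1", "g1", "g2", "g2", "g2")
)

groupVars <- list(
  GroupVar1 = c("group1a", "group1b", "group1c"),
  GroupVarX = "group2"
)

# Pretend the user picked this set
var <- "GroupVarX"

# byCols is a hypothetical helper variable holding the chosen column names
byCols <- groupVars[[var]]
res <- dt[, unique(id), by = byCols]
print(res)
```

Switching var to "GroupVar1" groups by the three group1 columns instead, without touching the data.table expression itself.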
As a side note:
You can easily extend this kind of variable selection to situations where a user is allowed to select multiple sets of grouping variables. In such cases, you could do it as follows:
Assume that the user-selected variable is now:
var <- c("GroupVar1", "GroupVarX") # two groups selected
Then, the by= statement becomes:
dt[, unique(id), by = unlist(groupVars[var], use.names=FALSE)]
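The multi-set variant can be wrapped in a small function so the long data.table expression is written only once. This is a sketch, not part of data.table: uniqueIdsBy and byCols are hypothetical names, the data are invented, and the unique() call is an addition that drops a column name listed in more than one selected set.

```r
library(data.table)

# Hypothetical helper: list unique ids within the selected grouping sets
uniqueIdsBy <- function(dt, groupVars, var) {
  # Flatten the chosen sets into one character vector of column names
  byCols <- unique(unlist(groupVars[var], use.names = FALSE))
  dt[, unique(id), by = byCols]
}

# Toy data: column names follow the question, the values are invented
dt <- data.table(
  id      = c(1, 1, 2, 3, 3, 4),
  group1a = c("a", "a", "a", "b", "b", "b"),
  group1b = c("x", "x", "y", "y", "y", "y"),
  group1c = c("p", "p", "p", "p", "q", "q"),
  group2  = c("g1", "g1", "g1", "g2", "g2", "g2")
)

groupVars <- list(
  GroupVar1 = c("group1a", "group1b", "group1c"),
  GroupVarX = "group2"
)

# Two sets selected: the result is grouped by all four columns
res <- uniqueIdsBy(dt, groupVars, c("GroupVar1", "GroupVarX"))
print(res)
```

The same function also covers the single-set case, e.g. uniqueIdsBy(dt, groupVars, "GroupVarX").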
Upvotes: 5