Reputation: 3629
I have a large data table like the following:
id var1 var2
 1    1    a
 2    2    d
 3    6    d
 4    4    b
 5    6    d
 6    8    a
I need to assign a category in var2 based on the values in var1. The categories do not follow any order with respect to the var1 values included in each category. For instance:
lista <- c(1,5,7)
listb <- c(4,9)
listd <- c(2,6)
I have tried two approaches without success.
The first uses the which function:
DT[which(var1 %in% lista), var2 := "a"]
and so on for listb and listd.
The function approach didn't work either (and it may be too slow for my large data table, since it would need many else if clauses). I wrote:
matchfun <- function(value){
  if (var1 %in% lista){
    value <- as.character(a)
  } else {
    return(value)
  }
}
Any ideas or comments on how to assign factors/categories to groups of values are very welcome.
Upvotes: 1
Views: 85
Reputation: 66819
I'd suggest a merge here. Let DT be your original data table.
library(data.table)
DT <- data.table(id=1:6,var1=c(1,2,6,4,6,8))
First, you need to store your mapping in a table:
matchDT <- rbindlist(list(
data.table(var1=lista,var2="a"),
data.table(var1=listb,var2="b"),
data.table(var1=listd,var2="d")
))
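With many categories, writing one data.table() call per list gets tedious. The same mapping table can be built from a named list. This is a minimal sketch, assuming the vectors are collected in a list that I'm calling cats (a hypothetical name, not something from the question):

```r
library(data.table)

# Hypothetical container: one named element per category
cats <- list(a = c(1, 5, 7), b = c(4, 9), d = c(2, 6))

# One row per (value, category) pair; each name is repeated
# once for every value in its vector
matchDT <- data.table(
  var1 = unlist(cats, use.names = FALSE),
  var2 = rep(names(cats), lengths(cats))
)
```

This gives the same seven rows as the rbindlist() version, so the merge step is unchanged.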
Then you can merge, optionally setting id as the key afterward to restore the original sorting.
setkey(DT,var1)
DT[matchDT,var2:=var2,nomatch=FALSE]
setkey(DT,id)
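If you'd rather not set keys at all, an update join with the on argument should give the same result while leaving the row order alone. This is a sketch, assuming a data.table version that supports on= (1.9.6 or later, as far as I know). The mapping table is written out in full so the snippet runs on its own:

```r
library(data.table)

DT <- data.table(id = 1:6, var1 = c(1, 2, 6, 4, 6, 8))
matchDT <- data.table(
  var1 = c(1, 5, 7, 4, 9, 2, 6),
  var2 = c("a", "a", "a", "b", "b", "d", "d")
)

# Update join: match rows of DT to matchDT on var1 and copy
# matchDT's var2 (written i.var2) into DT by reference.
# Rows of DT with no match keep NA in var2.
DT[matchDT, on = "var1", var2 := i.var2]
```

Because nothing is re-sorted, the final setkey(DT,id) is not needed with this variant.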
The result is
   id var1 var2
1:  1    1    a
2:  2    2    d
3:  3    6    d
4:  4    4    b
5:  5    6    d
6:  6    8   NA
The last value is NA because your lista object doesn't contain 8, although your example table suggests it should.
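On a large table it may be worth checking which var1 values ended up without a category. This is a small sketch, assuming DT already holds the merged result shown above (recreated here so the snippet runs on its own); unmatched is just an illustrative name:

```r
library(data.table)

# The merged result from above, recreated by hand
DT <- data.table(
  id   = 1:6,
  var1 = c(1, 2, 6, 4, 6, 8),
  var2 = c("a", "d", "d", "b", "d", NA)
)

# Distinct var1 values that did not match any category
unmatched <- DT[is.na(var2), unique(var1)]
```

Here unmatched holds only the value 8, which is the one missing from the lists.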
Upvotes: 3