Reputation: 13
Is there a base R function, or a function in data.table, that checks which rows of one matrix also appear in another matrix? I'm looking for an efficient way to do this on big data.
For example:
matrix A:
a b
a c
d b
d a
matrix B:
a b
d c
a d
A row-wise B %in% A should return TRUE, FALSE, FALSE (one value per row of B).
Upvotes: 1
Views: 92
Reputation: 49448
If you are indeed dealing with data.table (not clear from the OP), this is one possibility:
A = data.table(c('a','a','d','d'),c('b','c','b','a'))
B = data.table(c('a','d','a'), c('b','c','d'))
setkey(A, V1, V2)
A[B, .N, by = .EACHI] # in data.table <= 1.9.2 use A[B, .N]
# V1 V2 N
#1: a b 1
#2: d c 0
#3: a d 0
You can then do whatever you want with column N, including converting it to logical if you like:
as.logical(A[B, .N, by = .EACHI]$N)
#[1] TRUE FALSE FALSE
Upvotes: 3
Reputation: 9344
apply(B, 1, list) %in% apply(A, 1, list) works for both matrices and data.tables (and data.frames).
A <- cbind(c('a','a','d','d'), c('b','c','b','a'))
B <- cbind(c('a','d','a'), c('b','c','d'))
apply(B, 1, list) %in% apply(A, 1, list)
# [1] TRUE FALSE FALSE
identical(.Last.value, apply(data.table(B), 1, list) %in% apply(data.table(A), 1, list))
# [1] TRUE
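A related idea, as a rough sketch rather than part of the answer above: collapse each row into a single string key and compare the keys with %in%. The helper name rows_in and the separator "\r" are made up for illustration, and the approach assumes the separator never occurs in the data.

```r
# Hypothetical helper: TRUE for each row of x that also occurs as a row of y.
rows_in <- function(x, y, sep = "\r") {
  # Collapse every row into one string; assumes `sep` never occurs in the data.
  key <- function(m) do.call(paste, c(as.data.frame(m), sep = sep))
  key(x) %in% key(y)
}

A <- cbind(c('a','a','d','d'), c('b','c','b','a'))
B <- cbind(c('a','d','a'), c('b','c','d'))
res <- rows_in(B, A)
print(res)
# [1]  TRUE FALSE FALSE
```

Because the comparison runs on plain character vectors, this should also accept data.frames, as long as pasting the columns together gives an unambiguous key.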
Upvotes: 1
Reputation: 193527
Here's a possibility: use duplicated and rbind. Using @Robert's sample data:
A <- cbind(c('a','a','d','d'), c('b','c','b','a'))
B <- cbind(c('a','d','a'), c('b','c','d'))
duplicated(rbind(B, unique(A)), fromLast = TRUE)[seq_len(nrow(B))]
# [1] TRUE FALSE FALSE
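One caveat worth checking on real data: if B itself contains repeated rows, an earlier copy is flagged as duplicated because of the later copy, whether or not the row is in A. The sketch below illustrates this with a made-up matrix B2 and shows one possible workaround (test only the distinct rows, then map back). The names B2, row_key, naive and fixed are illustrative, and "\r" is assumed not to occur in the data.

```r
A  <- cbind(c('a','a','d','d'), c('b','c','b','a'))
# B2 is a made-up variant of B in which the row (d, c) appears twice.
B2 <- cbind(c('a','d','d'), c('b','c','c'))

# Direct call: the first (d, c) is flagged because of the second (d, c), not because of A.
naive <- duplicated(rbind(B2, unique(A)), fromLast = TRUE)[seq_len(nrow(B2))]
print(naive)
# [1]  TRUE  TRUE FALSE

# Workaround: test only the distinct rows of B2, then map the result back to every row.
row_key <- function(m) apply(m, 1, paste, collapse = "\r")  # assumes "\r" is not in the data
uB    <- unique(B2)
hit   <- duplicated(rbind(uB, unique(A)), fromLast = TRUE)[seq_len(nrow(uB))]
fixed <- hit[match(row_key(B2), row_key(uB))]
print(fixed)
# [1]  TRUE FALSE FALSE
```

If B is known to have no repeated rows, as in the sample data, the one-liner above is enough.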
Upvotes: 1