3zhang

Reputation: 13

How to do %in% by rows in R?

Is there a base R function, or a function in data.table, that can do this? I'm looking for an efficient way to handle large data.

For example:

matrix A:
a b
a c
d b
d a

matrix B:
a b
d c
a d

B %in% A should return T,F,F

Upvotes: 1

Views: 92

Answers (3)

eddi

Reputation: 49448

If you are indeed dealing with data.table (not clear from OP), this is one possibility:

A = data.table(c('a','a','d','d'),c('b','c','b','a'))
B = data.table(c('a','d','a'), c('b','c','d'))

setkey(A, V1, V2)

A[B, .N, by = .EACHI] # in data.table <= 1.9.2 use A[B, .N]
#   V1 V2 N
#1:  a  b 1
#2:  d  c 0
#3:  a  d 0

You can then do whatever you want with column N, including converting it to logical if you like:

as.logical(A[B, .N, by = .EACHI]$N)
#[1]  TRUE FALSE FALSE
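
On data.table 1.9.6 or later, the same count can probably be obtained without keying A, by passing the join columns through `on=`. This is only a sketch under that version assumption: the explicit column names `V1` and `V2` mirror the defaults above, and the names `counts` and `in_A` are illustrative. Unlike `setkey`, this leaves the row order of A untouched.

```r
library(data.table)

A <- data.table(V1 = c("a", "a", "d", "d"), V2 = c("b", "c", "b", "a"))
B <- data.table(V1 = c("a", "d", "a"), V2 = c("b", "c", "d"))

# For each row of B, count the matching rows of A (0 when there is no match).
counts <- A[B, on = .(V1, V2), .N, by = .EACHI]

# A row of B is "in" A when at least one row of A matched it.
in_A <- counts$N > 0
in_A
# [1]  TRUE FALSE FALSE
```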

Upvotes: 3

Robert Krzyzanowski

Reputation: 9344

apply(B, 1, list) %in% apply(A, 1, list)

works for both matrices and data.tables (and data.frames).

Example

library(data.table)  # only needed for the data.table check below
A <- cbind(c('a','a','d','d'), c('b','c','b','a'))
B <- cbind(c('a','d','a'), c('b','c','d'))
res <- apply(B, 1, list) %in% apply(A, 1, list)
res
# [1]  TRUE FALSE FALSE
identical(res, apply(data.table(B), 1, list) %in% apply(data.table(A), 1, list))
# [1] TRUE

Upvotes: 1

A5C1D2H2I1M1N2O1R2T1

Reputation: 193527

Here's a possibility: Use duplicated and rbind. Using @Robert's sample data:

A <- cbind(c('a','a','d','d'), c('b','c','b','a'))
B <- cbind(c('a','d','a'), c('b','c','d'))
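# Note: a row of B that repeats later in B is also flagged, so this assumes B has no duplicate rows.
duplicated(rbind(B, unique(A)), fromLast = TRUE)[seq_len(nrow(B))]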
# [1]  TRUE FALSE FALSE
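
If B may contain repeated rows of its own, one possible alternative is to collapse each row into a single string key and compare the keys with `%in%`. The helper `row_in` below is hypothetical (not a base R or data.table function), and it assumes no cell contains the separator character. The last call shows how the `duplicated` approach behaves on such input.

```r
# Hypothetical helper: build one string key per row, then compare keys.
# Assumption: no cell contains the separator (ASCII unit separator by default).
row_in <- function(B, A, sep = "\037") {
  key <- function(m) {
    do.call(paste, c(as.data.frame(m, stringsAsFactors = FALSE), sep = sep))
  }
  key(B) %in% key(A)
}

A <- cbind(c("a", "a", "d", "d"), c("b", "c", "b", "a"))
B <- cbind(c("a", "d", "a"), c("b", "c", "d"))
row_in(B, A)
# [1]  TRUE FALSE FALSE

# B2 repeats a row that does not occur in A.
B2 <- rbind(c("x", "y"), c("x", "y"))
row_in(B2, A)
# [1] FALSE FALSE
duplicated(rbind(B2, unique(A)), fromLast = TRUE)[seq_len(nrow(B2))]
# [1]  TRUE FALSE
```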

Upvotes: 1
