Reputation: 127
a<-c(19,24,34,47,47,47)
b<-c(3,14,24,25,47,47)
I want to know how many values in a match those in b; however, I'm running into issues when there are duplicate numbers present in both vectors. My desired answer for the above example would be 3, because 24, 47, 47 are shared between the two vectors.
If I use intersect:
intersect(a,b)
[1] 24 47
The 2nd matching 47 is ignored.
If I use %in%:
length(which(a %in% b))
[1] 4
The extra 47 in a is also counted.
I realise that I can do:
length(which(b %in% a))
[1] 3
However, I may also have cases where there is an extra matching value in b instead of a and so %in% is also not useful. For example:
a<-c(19,24,34,7,47,47)
b<-c(3,14,24,47,47,47)
length(which(b %in% a))
[1] 4 (I want the answer to still be 3)
So, without rearranging which vector comes first in the %in% call for each test, I cannot figure out how to do this. Can somebody show me how?
Upvotes: 1
Views: 1240
Reputation: 101034
You can use table + stack like below:
with(
as.data.frame.matrix(table(stack(list(a = a, b = b)))),
sum((p <- pmin(a, b))[p > 0])
)
which gives
[1] 3
where
> as.data.frame.matrix(table(stack(list(a = a, b = b))))
   a b
3  0 1
14 0 1
19 1 0
24 1 1
25 0 1
34 1 0
47 3 2
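To make the last step explicit, here is a minimal sketch that builds the same count table and adds the row-wise minimum as an extra column. The column name `shared` is made up for illustration; the data are the vectors from the question.

```r
a <- c(19, 24, 34, 47, 47, 47)
b <- c(3, 14, 24, 25, 47, 47)

# One row per distinct value, one column of counts per vector
counts <- as.data.frame.matrix(table(stack(list(a = a, b = b))))

# A value is shared as many times as it appears in BOTH vectors,
# i.e. the smaller of its two counts (0 if it is missing from one)
counts$shared <- pmin(counts$a, counts$b)

# Rows that contribute to the total: 24 (once) and 47 (twice)
counts[counts$shared > 0, ]

sum(counts$shared)  # 3
```

Since `pmin` is symmetric in its arguments, the result should not depend on which vector is called a and which b.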
Upvotes: 0
Reputation: 1
Reference: https://datascience.stackexchange.com/questions/9317/how-to-get-common-values-between-two-multi-sets
Here is one way to do it. Example:
a <- c(19,24,34,47,47,47)
b <- c(3,14,24,25,47,47)
d <- intersect(unique(a), unique(b))
min(length(a[a %in% d]),length(b[b %in% d]))
This takes the minimum of the two counts, which gives 3 for this example.
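One caveat, as far as I can tell: taking the minimum of the two totals matches the desired answer here, but it can overcount when each vector has a surplus of a different shared value. The sketch below uses made-up vectors `x` and `y` (not from the question) to compare it with a minimum taken per value; the names `by_total` and `by_value` are only for illustration.

```r
x <- c(1, 1, 2)
y <- c(1, 2, 2)

d <- intersect(unique(x), unique(y))

# Minimum of the two totals: all 3 elements of each vector are in d
by_total <- min(length(x[x %in% d]), length(y[y %in% d]))

# Minimum taken per value: one shared 1 plus one shared 2
by_value <- sum(vapply(
  d,
  function(v) min(sum(x == v), sum(y == v)),
  integer(1)
))

by_total  # 3
by_value  # 2
```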
Upvotes: 0
Reputation: 1387
How about:
sum(pmin(
table(a[a %in% intersect(a, b)]),
table(b[b %in% intersect(a, b)])
))
We make table()s of the chunks of a and b that are common to both, then we take the smaller count for each value from those tables and add them up.
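As a quick check, the expression can be wrapped in a function and run on both examples from the question. This is only a sketch; the name `count_shared` is hypothetical, and the body is the same expression with the intersection computed once.

```r
# Hypothetical helper; same logic as the expression above
count_shared <- function(a, b) {
  common <- intersect(a, b)
  sum(pmin(table(a[a %in% common]), table(b[b %in% common])))
}

# Extra 47 in the first vector
count_shared(c(19, 24, 34, 47, 47, 47), c(3, 14, 24, 25, 47, 47))  # 3

# Extra 47 in the second vector
count_shared(c(19, 24, 34, 7, 47, 47), c(3, 14, 24, 47, 47, 47))   # 3
```

Both tables are built from the same set of common values, so their entries line up by value and the argument order should not matter.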
Upvotes: 2