Rodrigo de Alexandre

Reputation: 662

counting matching elements of two vectors but not including repeated elements in the count

I've searched this forum extensively, but I haven't found a problem similar to the one I'm facing.

My question is:
I have two vectors
x <- c(1,1,2,2,3,3,3,4,4,4,6,7,8) and z <- c(1,1,2,4,5,5,5)

I need to count how many elements x and z have in common, where a repeated value counts once for each copy that appears in both vectors.

The answer for this example should be 4, because the two vectors share two 1s, one 2, and one 4.

Functions like match() don't help, since they match every repeated value against the same element of the other vector, so repeats are counted even when the other vector has no further copy. Using unique() also changes the final answer, from 4 to 3.
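To make the two failure modes concrete, here is a minimal sketch on the example vectors. The variable names over_count and under_count are only illustrative, and intersect() stands in for whichever unique()-based approach was tried.

```r
x <- c(1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 6, 7, 8)
z <- c(1, 1, 2, 4, 5, 5, 5)

# match() pairs every element of x with the FIRST equal element of z,
# so both 2s and all three 4s in x are reported as matched.
over_count <- sum(!is.na(match(x, z)))

# Dropping repeats first collapses the two shared 1s into a single match.
under_count <- length(intersect(unique(x), unique(z)))

over_count   # 7, too many
under_count  # 3, too few
```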

What I came up with is a loop that, every time it finds a number from one vector in the other, removes that number from the second vector so it won't be counted again.
The loop works fine for an example of this size; however, running it many times on larger vectors is too slow for my purposes.

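system.time({
    for (n in 1:1000) {
        x <- c(1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 6, 7, 8)
        z <- c(1, 1, 2, 4, 5, 5, 5)
        score <- 0
        # for each element of z, look for a copy that is still left in x
        for (s in z) {
            if (s %in% x) {
                # remove only the first matching copy so it is not counted again
                x <- x[-which(x == s)[1]]
                score <- score + 1
            }
        }
    }
})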

Can someone suggest a better method?
I've tried using lapply; it is faster for short vectors, but it becomes slower for longer ones.

Upvotes: 1

Views: 3483

Answers (1)

Rich Scriven

Reputation: 99321

Use R's vectorization to your advantage here. There's no looping necessary.

You could use a table to look at the frequencies,

table(z[z %in% x])
# 
# 1 2 4 
# 2 1 1 

And then take the sum of the table for the total:

sum(table(z[z %in% x]))
# [1] 4
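One caveat: the one-liner counts every element of z whose value occurs in x, which gives 4 for the example. If a value can occur more often in z than in x, the extra copies are counted too. The following is a minimal vectorized sketch of a per-value minimum that guards against that case. It is a different technique from the table() sum above, and count_common is a hypothetical helper name, not a base R function.

```r
# Hypothetical helper: size of the multiset intersection of a and b.
count_common <- function(a, b) {
  vals <- intersect(a, b)            # distinct values present in both vectors
  if (length(vals) == 0) return(0L)
  # Count how often each shared value occurs in each vector;
  # tabulate() silently ignores the NAs that match() returns for other values.
  count_a <- tabulate(match(a, vals), nbins = length(vals))
  count_b <- tabulate(match(b, vals), nbins = length(vals))
  # A value can only be paired up as often as its rarer side allows.
  sum(pmin(count_a, count_b))
}

x <- c(1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 6, 7, 8)
z <- c(1, 1, 2, 4, 5, 5, 5)
count_common(x, z)
# [1] 4
```

On the example both approaches agree. They differ for inputs such as count_common(c(1, 1, 1), c(1, 1)), which returns 2, whereas summing the table of c(1, 1, 1) filtered by %in% would give 3.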

Upvotes: 4
