offeltoffel

Reputation: 2801

R: Subset of vector for multiple matches

I ran into a seemingly easy problem today. It has been giving me a headache for more than an hour, and I don't know how to solve it without writing a loop (which is time-consuming and the opposite of elegant programming).

Suppose I have a set of numbers from 400 to 420 ("data"). Then there is a range specified by the user ("vector_subset"), which will later be used to subset the data. There is also a vector of numbers to be excluded from the data ("vector_substract").

This is what I have:

data <- seq(400,420)
vector_subset <- seq(405,412)
vector_substract <- c(402,403,404,405,408,409,412,413,414)

Now I want to find which elements I need to remove, because they are in both the user subset vector and the subtraction vector:

intersection <- intersect(vector_subset, vector_substract)

This works just fine:

> intersection
[1] 405 408 409 412

Now I want to exclude these values from the "data" vector. But if I try this:

result <- data[-which(data==intersection)]

R tells me that

In data == intersection : longer object length is not a multiple of shorter object length

If I delete one element at a time, it works fine. Like:

result <- data[-which(data==intersection[1])]
> result
 [1] 400 401 402 403 404 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420

-> first entry of "intersection" is gone (405).

So I could implement a for-loop and delete entry by entry, but that would take too long. Is there a better way to build my desired subset?

Thanks to all helpers!

Upvotes: 0

Views: 239

Answers (1)

Colonel Beauvel

Reputation: 31161

Just use the usual set operations:

setdiff(data, intersect(vector_subset, vector_substract))
#[1] 400 401 402 403 404 406 407 410 411 413 414 415 416 417 418 419 420
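
For context on why the original attempt fails: `data == intersection` compares the two vectors element by element and recycles the shorter one, so it is not a membership test, and R warns because 21 is not a multiple of 4. As a possible alternative to `setdiff`, here is a minimal sketch using base R's `%in%` operator with logical indexing. It reuses the variable names from the question and is just one way to write it, not the only one:

```r
data <- seq(400, 420)
vector_subset <- seq(405, 412)
vector_substract <- c(402, 403, 404, 405, 408, 409, 412, 413, 414)

# Values that are both in the user range and in the exclusion list
intersection <- intersect(vector_subset, vector_substract)

# %in% checks each element of data for membership in intersection and
# returns one TRUE/FALSE per element of data, so no recycling is involved
keep <- !(data %in% intersection)

# Logical indexing keeps only the elements that are not in intersection
result <- data[keep]
result
# values: 400 401 402 403 404 406 407 410 411 413 414 415 416 417 418 419 420
```

Two differences may matter depending on your data. `setdiff` drops duplicate values from its result, while the `%in%` version keeps every remaining element of `data`, duplicates included. Also, logical indexing avoids a pitfall of the `data[-which(...)]` pattern: if nothing matches, `which` returns an empty vector and `data[-which(...)]` returns an empty result instead of the full `data`.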

Upvotes: 2
