Reputation: 3
I would like to exclude observations that include specific values. The observations in my columns look like this: 501.512.518. Each observation represents three different individuals. Now I want to exclude all observations that include, e.g., the individual 512.
Is there a way to create sub-samples that can exclude observations which include the value 512 but do not equal or start with the value 512?
Upvotes: 0
Views: 1564
Reputation: 1246
The answer provided by @steveb is great, but you may want a base R alternative:
Example with a randomly generated data frame
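# Toy data: 200 rows with a group label and a random number between 500 and 600
df <- data.frame(Group = rep(c('A', 'B', 'C', 'D'), 50),
                 Number = sample(500:600, 200, replace = TRUE))
# Values to exclude
to_drop <- c(512)
# Keep only the rows whose Number is not in to_drop
df <- df[!(df$Number %in% to_drop), ]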
> df
Group Number
1 A 518
2 B 536
3 C 518
4 D 505
5 A 544
6 B 511
7 C 507
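If the column instead holds dot-delimited strings such as 501.512.518, as in the question, `%in%` compares the whole string and will not match a single individual. A minimal base R sketch, assuming a hypothetical character column `ids` and made-up example rows, splits each string on the dot and tests the pieces:

```r
# Hypothetical data: each row lists one or more individuals separated by dots
df <- data.frame(ids = c("501.512.518", "512.530.541", "1512.520.533",
                         "503.504.512", "512"),
                 stringsAsFactors = FALSE)
to_drop <- "512"

# Split each string on a literal dot, giving one character vector per row
parts <- strsplit(df$ids, ".", fixed = TRUE)

# TRUE when any individual in the row is in to_drop
has_id <- vapply(parts, function(p) any(p %in% to_drop), logical(1))

# TRUE when the ID appears somewhere after the first position
not_first <- vapply(parts, function(p) any(p[-1] %in% to_drop), logical(1))

# Drops every row that contains 512 anywhere
df_without_id <- df[!has_id, , drop = FALSE]

# Drops rows that contain 512 but keeps rows that equal or start with 512
df_keep_leading <- df[!not_first, , drop = FALSE]
```

Comparing whole pieces rather than substrings means an ID such as 1512 is not mistaken for 512.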
Upvotes: 0
Reputation: 598
This question lacks some details, but I hope this helps. I would use the grepl function.
To remove all rows in which a column (denoted col) contains, but does not start with, 512, do:
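newDF <- oldDF[!grepl('.512', oldDF$col, fixed = TRUE), ]
grepl('.512', oldDF$col, fixed = TRUE) returns a logical vector that is TRUE wherever the column "col" contains the literal string ".512". Without fixed = TRUE, the dot is treated as a regular-expression wildcard that matches any character.
The ! in front of it negates the vector, so those rows are removed.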
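A plain substring match also hits longer IDs such as 5123. As a hedged sketch, assuming a hypothetical character column `col` of dot-separated IDs and made-up example rows, the pattern can require that 512 is a whole ID preceded by a dot:

```r
# Hypothetical data with dot-separated IDs
oldDF <- data.frame(col = c("501.512.518", "512.530.541", "501.5123.518",
                            "503.504.512", "512"),
                    stringsAsFactors = FALSE)

# "\\." is a literal dot; "(\\.|$)" requires that 512 is followed by
# another dot or by the end of the string, so 5123 does not match
pattern <- "\\.512(\\.|$)"

# TRUE for rows that contain 512 as a whole ID after the first position
hit <- grepl(pattern, oldDF$col)

# Keep the remaining rows, including those that equal or start with 512
newDF <- oldDF[!hit, , drop = FALSE]
```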
Hope it helps.
Upvotes: 1
Reputation: 5532
It would be helpful if you had a more specific example, but making some assumptions and using dplyr, you could do something like the following:
library(dplyr)
# Values to exclude
exclusion_values <- c(501, 512, 518)
# Keep only the rows whose col_of_interest is not one of the excluded values
new_df <- old_df %>% filter(!col_of_interest %in% exclusion_values)
Is this what you are looking for?
Upvotes: 0