Maya

Reputation: 43

Need to subset by excluding multiple values in a categorical variable

I have a categorical field, and I want to subset by 'excluding' multiple values.

Initially, I assumed I could either list the values I want to exclude directly in the code, or put them in a separate vector and refer to that vector in the code (see below).

subset(data, data$variable != c("x1", "x2", "x3"))

or

Exclude_Prod = c("x1", "x2", "x3")

subset(data, data$variable != Exclude_Prod)

I have multiple values in a single field, which is a categorical variable.

I want to exclude these values and then subset the data. I would rather exclude than keep because there are fewer values to exclude than values to keep.

Upvotes: 1

Views: 6472

Answers (3)

PavoDive

Reputation: 6496

A data.table way:

require(data.table)
setDT(data)[! variable %in% c("x1", "x2", "x3"), ]

Please note that naming a data frame "data" is a bad idea, as there is a function called data in the utils package.

Upvotes: 0

Maya

Reputation: 43

Thank you, Nelson. After a lot of searching, getting help, and trial and error, I used tidyverse:

data2 <- data1 %>%
  filter(variable != "x1" & variable != "x2")
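
For context on why the `!=` attempt in the question did not work while chaining conditions (or using `%in%`) does: `!=` compares element by element and recycles the shorter vector, so each row is only checked against one of the excluded values. Below is a minimal base R sketch; the vector names `variable` and `exclude` and their contents are made up for illustration.

```r
# Hypothetical data: a categorical column and the values to exclude
variable <- c("x1", "x2", "x3", "x4", "x3", "x1")
exclude  <- c("x1", "x2", "x3")

# != recycles `exclude` along `variable`, so the comparisons are
# x1 != x1, x2 != x2, x3 != x3, x4 != x1, x3 != x2, x1 != x3
recycled <- variable != exclude

# %in% tests each element against the whole set, then ! negates it
wanted <- !(variable %in% exclude)

# Chaining != with & gives the same result as the negated %in%
chained <- variable != "x1" & variable != "x2" & variable != "x3"

print(variable[recycled])  # some excluded values slip through
print(variable[wanted])    # only values outside `exclude` remain
print(identical(wanted, chained))
```

With many values to exclude, the negated `%in%` form is usually easier to maintain than a long chain of `!=` conditions.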

Upvotes: 0

NelsonGon

Reputation: 13319

Try this, replacing the names with your own variables. data3 is the dataset.

library(dplyr)

Using some fake data:

data3 <- data.frame(Sales = c(11, 12, 13), Exclude_Prod = c("x1", "x2", "x3"))

With base R:

data3[!data3$Exclude_Prod %in% c("x1", "x2"), ]

The "disadvantage" is that base R preserves the original row indexing (row names). With dplyr:

data3 %>%
  filter(!Exclude_Prod %in% c("x1", "x2"))

Result:

  Sales Exclude_Prod
1    13           x3
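
To illustrate the point about base R keeping the original indexing, here is a small sketch using the same fake data. The names `kept` and `before` are introduced only for this example; resetting the row names with `rownames(kept) <- NULL` is one common way to renumber the rows, not the only one.

```r
# Same fake data as above
data3 <- data.frame(Sales = c(11, 12, 13),
                    Exclude_Prod = c("x1", "x2", "x3"))

# Base R subsetting: drop rows whose Exclude_Prod is x1 or x2
kept <- data3[!data3$Exclude_Prod %in% c("x1", "x2"), ]

# The surviving row keeps its original row name
before <- rownames(kept)
print(before)

# Assigning NULL resets the row names to 1, 2, ...
rownames(kept) <- NULL
print(rownames(kept))
```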

Original Answer:

mtcars %>%
  mutate(ID = row.names(.)) %>%
  select(ID) %>%
  filter(!ID %in% c("Volvo 142E", "Toyota Corona"))  # e.g. Variable %in% c("x1", "x2", "x3")

Upvotes: 3
