Veraaa

Reputation: 301

Subset dataframe and put items in new column in R

My question is twofold. First, I want to filter a data frame df on a column that contains many different car types, matching several values at once. For instance, when I look for a BMW 3er Reihe, I also want to include BMW 3er Reihe; 3161 SEDAN.

Example dataset:

Item   Brand   Type
1      BMW     3er Reihe
2      BMW     3er Reihe; 3161 SEDAN
3      Audi    A1 
4      Audi    A3

I did this with grep:

carsegmentC <- df[grep("3er Reihe|A3", df$Type), ]

This filters the data frame exactly as I want, but it makes the second part of my question harder. Ultimately I want to put the filtered values into new columns of the data frame, so that it looks like this:

Item  Brand  Type                    Carsegment C            Carsegment B
1     BMW    3er Reihe               3er Reihe
2     BMW    3er Reihe; 3161 SEDAN   3er Reihe; 3161 SEDAN
3     Audi   A1                                              A1
4     Audi   A3                      A3

This does not seem to work with grep, and the other things I've tried, such as copying columns, don't work either. Hopefully someone can help, I'd appreciate it!

Reproducible example:

df <- data.frame(Item = c(1,2,3,4), Brand=c("BMW", "BMW", "Audi", "Audi"), Type=c("3er Reihe", "3er Reihe;3161 SEDAN ", "A1", "A3"))

Upvotes: 2

Views: 352

Answers (1)

akrun

Reputation: 887158

Place the patterns in a list and loop through them with lapply. For each pattern, grepl returns a logical index; wrapping it in ifelse keeps the 'Type' value where the index is TRUE and returns "" where it is FALSE. Assign the resulting list to the new columns in 'df'.

# One pattern per new column: keep the matching Type value, "" otherwise
df[c("CarsegmentC", "CarsegmentB")] <- lapply(list("3er Reihe|A3", "A1"), 
       function(pat) ifelse(grepl(pat, df$Type), df$Type, ""))

df
#  Item Brand                 Type          CarsegmentC CarsegmentB
#1    1   BMW            3er Reihe            3er Reihe            
#2    2   BMW 3er Reihe;3161 SEDAN 3er Reihe;3161 SEDAN            
#3    3  Audi                   A1                               A1
#4    4  Audi                   A3                   A3            
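
As a small illustration of why grepl fits here better than the grep call from the question: grep returns only the positions of the matching elements, while grepl returns one TRUE/FALSE per row, which is the shape ifelse needs to build a full-length column. The names type, idx_pos, idx_lgl and segment_c below are made up for this sketch.

```r
# Stand-alone copy of the Type column (hypothetical name, for illustration)
type <- c("3er Reihe", "3er Reihe;3161 SEDAN", "A1", "A3")

# grep() returns the positions of the elements that match
idx_pos <- grep("3er Reihe|A3", type)
idx_pos
# [1] 1 2 4

# grepl() returns one logical value per element
idx_lgl <- grepl("3er Reihe|A3", type)
idx_lgl
# [1]  TRUE  TRUE FALSE  TRUE

# ifelse() uses that logical vector to build a column of the same length
segment_c <- ifelse(idx_lgl, type, "")
```

Because segment_c has one entry per row, it can be assigned directly as a new column, which the shorter result of grep cannot.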

data

df <- data.frame(Item = c(1,2,3,4), Brand=c("BMW", "BMW", "Audi", 
     "Audi"), Type=c("3er Reihe", "3er Reihe;3161 SEDAN", "A1", "A3"), 
     stringsAsFactors=FALSE)
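
A note on stringsAsFactors=FALSE, as a hedged sketch: before R 4.0.0, data.frame converted strings to factors by default, and ifelse applied to a factor tends to return the factor's integer codes rather than its labels. If 'Type' happens to be a factor, converting it with as.character inside ifelse should avoid that. The names df_f and seg below are hypothetical.

```r
# Hypothetical data frame where Type is deliberately stored as a factor
df_f <- data.frame(Type = c("3er Reihe", "3er Reihe;3161 SEDAN", "A1", "A3"),
                   stringsAsFactors = TRUE)

# Convert to character so ifelse() returns the labels, not the factor codes
seg <- ifelse(grepl("3er Reihe|A3", df_f$Type), as.character(df_f$Type), "")
seg
```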

Upvotes: 1
