Reputation: 301
My question is twofold. I want to filter a column in a data frame df based on several values. The column contains many different car types. If I’m looking for a BMW 3er Reihe, for instance, I also want to include BMW 3er Reihe; 3161 SEDAN.
Example dataset:
Item Brand Type
1 BMW 3er Reihe
2 BMW 3er Reihe; 3161 SEDAN
3 Audi A1
4 Audi A3
I did this with grep:
carsegmentC <- df[grep("3er Reihe|A3", df$Type), ]
This works well and filters the data frame exactly how I want, but it makes the next part of my question harder. Ultimately I want to put the filtered values into new columns of the same data frame, so that it looks like this:
Item  Brand  Type                   Carsegment C           Carsegment B
1     BMW    3er Reihe              3er Reihe
2     BMW    3er Reihe; 3161 SEDAN  3er Reihe; 3161 SEDAN
3     Audi   A1                                            A1
4     Audi   A3                     A3
This does not seem to work with grep, and I’ve tried other things like copying columns, but that doesn’t work either. Hopefully someone can help, I’d appreciate it!
Reproducible example:
df <- data.frame(Item = c(1,2,3,4), Brand=c("BMW", "BMW", "Audi", "Audi"), Type=c("3er Reihe", "3er Reihe;3161 SEDAN ", "A1", "A3"))
Upvotes: 2
Views: 352
Reputation: 887158
Place the patterns in a list, loop through the patterns, apply grepl to get a logical index, wrap it with ifelse to return "" for the FALSE values in grepl, and assign the result to new columns in 'df'.
df[c("CarsegmentC", "CarsegmentB")] <- lapply(list("3er Reihe|A3", "A1"),
function(pat) ifelse(grepl(pat, df$Type), df$Type, ""))
df
#  Item Brand                 Type          CarsegmentC CarsegmentB
#1    1   BMW            3er Reihe            3er Reihe
#2    2   BMW 3er Reihe;3161 SEDAN 3er Reihe;3161 SEDAN
#3    3  Audi                   A1                               A1
#4    4  Audi                   A3                   A3
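As background on why grepl is used above rather than grep: grep returns the integer positions of the matching elements, while grepl returns a TRUE/FALSE value for every element, which is the shape ifelse needs. A minimal sketch on the example Type values (the names type, idx, hit and seg are only for illustration):

```r
type <- c("3er Reihe", "3er Reihe;3161 SEDAN", "A1", "A3")

# grep: positions of the matching elements, handy for subsetting rows
idx <- grep("3er Reihe|A3", type)

# grepl: one logical value per element, same length as the column
hit <- grepl("3er Reihe|A3", type)

# ifelse keeps the matching values and blanks out the others
seg <- ifelse(hit, type, "")
seg
```

Because seg has one entry per row, it can be assigned directly as a new column, which is not possible with the shorter index vector from grep.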
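Data used above (stringsAsFactors = FALSE keeps Type as character, so ifelse returns the text rather than factor codes):
df <- data.frame(Item = c(1, 2, 3, 4),
                 Brand = c("BMW", "BMW", "Audi", "Audi"),
                 Type = c("3er Reihe", "3er Reihe;3161 SEDAN", "A1", "A3"),
                 stringsAsFactors = FALSE)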
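If there are more segments, or the capitalisation in Type is inconsistent (the question mixes "3er Reihe" and "3er reihe"), the same idea can be written with a named vector of patterns. This is only a sketch: the segments vector and its names are made up for illustration, the lower-case "reihe" in row 2 is introduced here to show the effect, and ignore.case = TRUE is an assumption about what is wanted.

```r
df <- data.frame(Item = c(1, 2, 3, 4),
                 Brand = c("BMW", "BMW", "Audi", "Audi"),
                 Type = c("3er Reihe", "3er reihe;3161 SEDAN", "A1", "A3"),
                 stringsAsFactors = FALSE)

# hypothetical mapping: new column name -> regex pattern
segments <- c(CarsegmentC = "3er Reihe|A3", CarsegmentB = "A1")

for (nm in names(segments)) {
  # logical index of rows whose Type matches this segment's pattern
  hit <- grepl(segments[[nm]], df$Type, ignore.case = TRUE)
  # copy Type where it matches, empty string elsewhere
  df[[nm]] <- ifelse(hit, df$Type, "")
}
df
```

Adding another segment then only means adding one more name/pattern pair to segments.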
Upvotes: 1