Reputation: 1101
I need to remove a set of substrings from a character vector. Each substring may or may not occur, possibly several times. There are fewer substrings to drop than strings in the vector.
I would like to use a loop but gsub seems to fail in a for loop.
drop <- c("red ","blue ","yellow ")
auto <- data.frame(entry=c("red car","red yellow car","car"))
for(i in 1:length(drop)){
auto$entry_simple <- gsub(drop[i],"",auto$entry)
}
The loop only works for the last entry of drop. Why? This is the result:
entry entry_simple
1 red car red car
2 red yellow car red car
3 car car
Instead of
entry entry_simple
1 red car car
2 red yellow car car
3 car car
Upvotes: 2
Views: 66
Reputation: 887028
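We can use str_remove, which is vectorized over both the string and the pattern, so each element of drop is applied to the corresponding row (shown here on the original data: 'red car', 'blue car', 'yellow car')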
library(dplyr)
library(stringr)
auto %>%
mutate(entry_simple = str_remove(entry, drop))
# entry entry_simple
#1 red car car
#2 blue car car
#3 yellow car car
If we look at the loop, gsub is applied to the entire column 'entry' and the output is assigned to 'entry_simple', i.e. 'entry_simple' is overwritten in each iteration
lapply(drop, function(x) gsub(x, "", auto$entry))
#[[1]]
#[1] "car" "blue car" "yellow car"
#[[2]]
#[1] "red car" "car" "yellow car"
#[[3]]
#[1] "red car" "blue car" "car"
leaving the last one, i.e. 'red car', 'blue car', 'car', as the final output
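A minimal sketch of why only the last pattern sticks, using the data from the updated question rather than the data in the output above. Each pass reads the untouched 'entry' column, so whatever the previous pass removed is discarded. The name last_only is only for illustration.

```r
drop <- c("red ", "blue ", "yellow ")
auto <- data.frame(entry = c("red car", "red yellow car", "car"))

for (i in seq_along(drop)) {
  # Always reads the original column, so earlier removals are lost
  auto$entry_simple <- gsub(drop[i], "", auto$entry)
}

# The loop ends up equivalent to one call with only the last pattern
last_only <- gsub(drop[length(drop)], "", auto$entry)
identical(auto$entry_simple, last_only)
```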
It seems the OP wanted to apply each replacement to the corresponding row. In that case, just use the index on the 'x' value for gsub and on the lhs of <-
auto$entry_simple <- auto$entry
for(i in seq_along(drop)) auto$entry_simple[i] <- gsub(drop[i], "", auto$entry[i])
auto
# entry entry_simple
#1 red car car
#2 blue car car
#3 yellow car car
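Based on the OP's updated post, where every substring should be removed from every row, initialize 'entry_simple' once and pass it back into gsub so that each iteration builds on the previous one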
auto$entry_simple <- auto$entry
for(i in 1:length(drop)) auto$entry_simple <- gsub(drop[i],"",auto$entry_simple)
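As a possible loop-free alternative (a sketch, not part of the original answer), the substrings can be joined into one regular expression with "|" and removed in a single gsub call. This assumes the entries of drop contain no regex metacharacters; otherwise they would need escaping first. The name pattern is only for illustration.

```r
drop <- c("red ", "blue ", "yellow ")
auto <- data.frame(entry = c("red car", "red yellow car", "car"))

# Build one alternation pattern: "red |blue |yellow "
pattern <- paste(drop, collapse = "|")

# gsub removes every match of any alternative in each string
auto$entry_simple <- gsub(pattern, "", auto$entry)
auto
```

This avoids the accumulation issue altogether because the column is only processed once.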
Upvotes: 2
Reputation: 1101
This works. Is it all down to "seq_along"?
for(i in seq_along(drop)) auto$entry <- gsub(drop[i], "",auto$entry)
Upvotes: 0