Reputation: 755
I have a string that contains several "\n" characters. I would like to look at each line and remove every line that contains the word "banana".
Sample DF:
farm_data <- data.frame(shop=c('fruit'),
sentence=c('the basket contains apples
bananas are the best
are we going to eat bananas
why not just boil the fruits
let us make some banana smoothie'), stringsAsFactors=FALSE)
What I've tried:
farm_data$sentence <- gsub(".* bananas .* \n", "\n", farm_data$sentence)
What I want:
clean_data <- data.frame(shop=c('fruit'),
sentence=c('the basket contains apples
why not just boil the fruits'), stringsAsFactors=FALSE)
Lines that contain banana have been removed.
Thanks.
Upvotes: 1
Views: 160
Reputation: 1189
I address the question in perhaps a roundabout way. I first split the string on the line break character \n.
sentence <- unlist(strsplit(as.character(farm_data$sentence), '\n'))
After that I remove those elements of the resulting split that contain the word "banana".
cleanSentence <- sentence[!grepl('banana', sentence)]  # logical mask; unlike -which(), this keeps every line when nothing matches
Then I hammer it back together using the paste function.
clean_data <- data.frame(shop=c('fruit'),
sentence=paste(cleanSentence, collapse='\n'), stringsAsFactors=FALSE)
Hopefully this isn't too ham-fisted. :)
To address your concern about reusing this for other "fruits" or strings:
cleanFruit <- function(fruit = 'banana'){
  # Split the sentence from the global farm_data into individual lines
  sentence <- unlist(strsplit(as.character(farm_data$sentence), '\n'))
  # Keep only the lines that do not contain `fruit`
  cleanSentence <- sentence[!grepl(fruit, sentence)]
  clean_data <- data.frame(shop=c('fruit'),
                           sentence=paste(cleanSentence, collapse='\n'), stringsAsFactors=FALSE)
  return(clean_data)
}
Write it up into a function, and hand it a given fruit (or word). @rawr's answer seems a bit cleaner.
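If the data frame has more than one row, the same split, filter, paste idea can be applied to each string separately. The following is only a sketch: `drop_lines_with` is a hypothetical helper name, the second row of the sample data is made up for illustration, and `fixed = TRUE` assumes the word should be matched literally rather than as a regular expression.

```r
# Hypothetical helper: for each string in `text`, drop every line containing `word`.
drop_lines_with <- function(text, word) {
  # strsplit() returns one character vector of lines per input string
  pieces <- strsplit(as.character(text), "\n", fixed = TRUE)
  vapply(pieces, function(lines) {
    # Keep the lines that do not contain `word`, then glue them back together
    paste(lines[!grepl(word, lines, fixed = TRUE)], collapse = "\n")
  }, character(1))
}

# Made-up two-row example (the second row is an assumption, not from the question)
farm_data <- data.frame(
  shop = c("fruit", "veg"),
  sentence = c("the basket contains apples\nbananas are the best\nwhy not just boil the fruits",
               "carrots only\nno fruit here"),
  stringsAsFactors = FALSE
)

farm_data$sentence <- drop_lines_with(farm_data$sentence, "banana")
cat(farm_data$sentence[1])
```

Passing the column in as an argument, instead of reading `farm_data` from the global environment, keeps the helper usable on any character vector.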
Upvotes: 1
Reputation: 20811
x <- 'the basket contains apples
bananas are the best
are we going to eat bananas
why not just boil the fruits
let us make some banana smoothie'
cat(x)
# the basket contains apples
# bananas are the best
# are we going to eat bananas
# why not just boil the fruits
# let us make some banana smoothie
cat(gsub('.*banana.*\\n?', '', x, perl = TRUE))
# the basket contains apples
# why not just boil the fruits
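Because `gsub` is vectorised, the same pattern can be applied directly to the data frame column from the question. One detail to be aware of: when the last line is one of the removed ones, the newline that ended the previous line stays behind. A minimal sketch that also strips that trailing newline, assuming that is the desired result (the `cleaned` variable is just an illustrative name):

```r
x <- 'the basket contains apples
bananas are the best
are we going to eat bananas
why not just boil the fruits
let us make some banana smoothie'

# Drop every line containing "banana", then strip the newline left behind
# when the final line was one of the removed ones.
cleaned <- sub('\\n$', '', gsub('.*banana.*\\n?', '', x, perl = TRUE))

# The same expression works on a whole column, one string per row.
farm_data <- data.frame(shop = 'fruit', sentence = x, stringsAsFactors = FALSE)
farm_data$sentence <- sub('\\n$', '', gsub('.*banana.*\\n?', '', farm_data$sentence, perl = TRUE))
cat(farm_data$sentence)
```

With `perl = TRUE` the `.` does not match a newline, which is what confines each match to a single line.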
Upvotes: 3