Reputation: 1769
I am trying to remove a pattern from strings with gsub, given the following character vector:
articles<-c("RT @name1: hello world", "@nickname1: bye bye guys",
"RT @name2_surname2: I have no text", "Hello!")
The pattern consists of the terms between @ and :, but only in the strings that begin with RT. Hence, in our case, the pattern is:
"name1" "name2_surname2"
The pattern can be obtained by using
pat <- "^RT.*?@(.*?):.*"
res <- gsub(pat,"\\1",articles[grepl(pat,articles)])
After removing this pattern, the desired result is:
"RT : hello world", "@nickname1: bye bye guys",
"RT : I have no text", "Hello!"
However, when I use:
gsub(res,"",articles)
I obtain a wrong result:
[1] "RT @: hello world" "@nick: bye bye guys"
[3] "RT @name2_surname2: I have no text" "Hello!"
Warning message:
In gsub(res, "", articles) :
argument 'pattern' has length > 1 and only the first element will be used
Upvotes: 1
Views: 474
Reputation: 21400
If the desired output is, as stated, this:
"RT : hello world", "@nickname1: bye bye guys", "RT : I have no text", "Hello!"
then this solution works:
First, you need to change the pattern so that the capturing group includes the @:
pat <- "^RT.*?(@.*?):.*"
res <- gsub(pat,"\\1",articles[grepl(pat,articles)])
Then, as suggested by @Akrun, you can paste the elements of res together, separated by |, which lets you use them as a single alternation pattern:
gsub(paste0(res, collapse = "|"), "", articles)
That will give you the desired output.
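For anyone who wants to run the steps above end to end, here is a minimal sketch using the vector from the question. The names combined and cleaned are only illustrative and do not appear in the original post.

```r
articles <- c("RT @name1: hello world", "@nickname1: bye bye guys",
              "RT @name2_surname2: I have no text", "Hello!")

# Capture "@" plus the name, but only in strings that start with "RT"
pat <- "^RT.*?(@.*?):.*"
res <- gsub(pat, "\\1", articles[grepl(pat, articles)])
# res is c("@name1", "@name2_surname2")

# gsub() uses only one pattern, so join the elements with "|" (alternation)
combined <- paste0(res, collapse = "|")   # illustrative name
cleaned <- gsub(combined, "", articles)   # illustrative name
print(cleaned)
# [1] "RT : hello world"         "@nickname1: bye bye guys"
# [3] "RT : I have no text"      "Hello!"
```

Because the @ is part of each alternative, "@nickname1" is left alone: it contains "name1" but not "@name1".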
Upvotes: 2
Reputation: 887028
We can paste the patterns into a single string and pass that to gsub, because its pattern argument is not vectorized, i.e., it accepts only a vector of length 1:
gsub(paste0("\\b(", paste(res, collapse="|"), ")\\b"), "", articles)
#[1] "RT @: hello world" "@nickname1: bye bye guys" "RT @: I have no text" "Hello!"
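To see what the \\b word boundaries contribute, here is a small comparison sketch. It assumes res holds the names without the @, as in the question; the variables no_boundary and with_boundary are illustrative only.

```r
articles <- c("RT @name1: hello world", "@nickname1: bye bye guys",
              "RT @name2_surname2: I have no text", "Hello!")
res <- c("name1", "name2_surname2")  # names as extracted in the question

# Plain alternation: "name1" also matches inside "nickname1"
no_boundary <- gsub(paste(res, collapse = "|"), "", articles)
print(no_boundary[2])
# [1] "@nick: bye bye guys"

# With \\b on both sides, only whole words are removed
with_boundary <- gsub(paste0("\\b(", paste(res, collapse = "|"), ")\\b"),
                      "", articles)
print(with_boundary[2])
# [1] "@nickname1: bye bye guys"
```

Note that this variant keeps the @ in the RT strings, so it gives "RT @: hello world" rather than "RT : hello world".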
Upvotes: 0