Reputation: 8044
Does anyone know how to extract x and y from the string "x and y" using the grep function (without the stringi package), where x and y are arbitrary character strings?
I am not very experienced with regular expressions.
Thanks for any response.
Upvotes: 2
Views: 151
Reputation: 70732
As @MrFlick commented, grep is not the right function for extracting these substrings: it reports which elements match, not the matched text.
You can use regmatches instead and do something like this:
> x <- c('x and y', 'abc and def', 'foo and bar')
> regmatches(x, gregexpr('and(*SKIP)(*F)|\\w+', x, perl=TRUE))
# [[1]]
# [1] "x" "y"
# [[2]]
# [1] "abc" "def"
# [[3]]
# [1] "foo" "bar"
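To make the difference between grep and regmatches concrete, here is a minimal sketch. The input vector and the variable names idx, hits and first_word are made up for illustration and are not part of the original answer.

```r
x <- c('x and y', 'abc and def', 'nothing here')

# grep() answers "which elements match?": it returns their indices ...
idx <- grep(' and ', x, fixed = TRUE)
print(idx)

# ... or, with value = TRUE, the whole matching elements, never the parts.
hits <- grep(' and ', x, fixed = TRUE, value = TRUE)
print(hits)

# regmatches() returns the matched substrings themselves
# (here: the leading run of letters in each matching element).
first_word <- regmatches(hits, regexpr('^[[:alpha:]]+', hits))
print(first_word)
```

So grep is useful for filtering the vector first, while regmatches does the actual extraction.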
Or, if the separator " and " is always the same, use strsplit, as suggested in the comments.
> x <- c('x and y', 'abc and def', 'foo and bar')
> strsplit(x, ' and ', fixed=TRUE)
# [[1]]
# [1] "x" "y"
# [[2]]
# [1] "abc" "def"
# [[3]]
# [1] "foo" "bar"
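If the pieces are needed as columns rather than as a list, the strsplit result can be bound into a matrix. This is only a sketch: it assumes every string contains " and " exactly once, and the column names left and right are hypothetical.

```r
x <- c('x and y', 'abc and def', 'foo and bar')

# strsplit() returns a list with one character vector per input string.
parts <- strsplit(x, ' and ', fixed = TRUE)

# Bind the pieces row-wise into a 3 x 2 character matrix.
# Assumption: each string splits into exactly two pieces.
m <- do.call(rbind, parts)
colnames(m) <- c('left', 'right')  # hypothetical column names
print(m)
```

Strings with no separator, or with more than one, would need to be handled before the rbind step.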
Upvotes: 4
Reputation: 78792
The regex below matches a run of letters, the word "and", and another run of letters, capturing both runs so that regmatches can extract them:
txt <- c("x and y", "a and b", " C and d", "qq and rr")
matches <- regexec("([[:alpha:]]+)[[:blank:]]+and[[:blank:]]+([[:alpha:]]+)", txt)
regmatches(txt, matches)[[1]][2:3]
## [1] "x" "y"
regmatches(txt, matches)[[2]][2:3]
## [1] "a" "b"
regmatches(txt, matches)[[3]][2:3]
## [1] "C" "d"
regmatches(txt, matches)[[4]][2:3]
## [1] "qq" "rr"
([[:alpha:]]+) matches one or more alphabetic characters and places them in a capture group. [[:blank:]]+ matches one or more blank characters (spaces or tabs). There are less verbose ways to write these regexes, but the expanded forms (to me) are easier to grok for readers of the code who aren't familiar with regexes.
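For comparison, one possible shorter spelling uses the \w and \s shorthands. This is a sketch rather than an exact equivalent: \w also accepts digits and underscores, and \s also accepts newlines, so the two patterns agree only on inputs like the ones here. The names verbose and compact are made up for this example.

```r
txt <- c("x and y", "a and b", " C and d", "qq and rr")

verbose <- "([[:alpha:]]+)[[:blank:]]+and[[:blank:]]+([[:alpha:]]+)"
# Shorthand version: \\w also matches digits and "_", \\s also matches newlines.
compact <- "(\\w+)\\s+and\\s+(\\w+)"

res_verbose <- regmatches(txt, regexec(verbose, txt))
res_compact <- regmatches(txt, regexec(compact, txt))

# Compare the two results on these purely alphabetic inputs.
print(identical(res_verbose, res_compact))
```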
I also didn't need to call regmatches four times, but it was faster to cut and paste for a toy example.
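A single call can cover the whole vector. The sketch below is one way to do that with vapply; the names pattern, groups and pairs are illustrative and not from the original answer.

```r
txt <- c("x and y", "a and b", " C and d", "qq and rr")
pattern <- "([[:alpha:]]+)[[:blank:]]+and[[:blank:]]+([[:alpha:]]+)"

# One regexec()/regmatches() call handles the whole vector.
groups <- regmatches(txt, regexec(pattern, txt))

# Element 1 of each result is the full match; 2 and 3 are the capture groups.
# vapply() enforces that each result is a character vector of length 2
# (a string that does not match would contribute NA, NA).
pairs <- t(vapply(groups, function(g) g[2:3], character(2)))
print(pairs)
```

Each row of pairs then holds the two extracted pieces for one input string.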
Upvotes: 4