Reputation: 33
Please, I want to use gsub to extract strings from this vector:
x<-("Prayer: Lord. Have mercy on.")
The desired output is "Lord" and "Have mercy on" separately.
I tried gsub('.*:(.*)','\\1',x)
but it doesn't give them separately.
Upvotes: 3
Views: 1423
Reputation: 626758
If you need to get these two values separately, you can use
x <- c("Prayer: Lord. Have mercy on.")
gsub("^[^:]*:\\s*([^.]+).*","\\1",x)
## => [1] "Lord"
gsub("^[^:]*:\\s*[^.]+\\.\\s*([^.]+).*","\\1",x)
## => [1] "Have mercy on"
See the R demo online, and the regex #1 and regex #2 demos. It does not matter whether you use sub or gsub with these regexps: they work the same, although sub is the more logical choice, since all you need is to replace the whole string with the value of the first capturing group.
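As a quick check of the sub versus gsub remark, here is a minimal sketch that runs both patterns with sub and compares the results with the gsub calls above. The variable names first and second are only illustrative.

```r
x <- "Prayer: Lord. Have mercy on."

# Pattern 1: capture the text between the colon and the first dot
first <- sub("^[^:]*:\\s*([^.]+).*", "\\1", x)

# Pattern 2: skip the first sentence, capture the text up to the second dot
second <- sub("^[^:]*:\\s*[^.]+\\.\\s*([^.]+).*", "\\1", x)

# Each pattern is anchored with ^ and consumes the whole string,
# so there is only one match and gsub has nothing more to replace
same_first  <- identical(first,  gsub("^[^:]*:\\s*([^.]+).*", "\\1", x))
same_second <- identical(second, gsub("^[^:]*:\\s*[^.]+\\.\\s*([^.]+).*", "\\1", x))
```

Because each pattern starts at ^ and ends with .*, a single replacement covers the entire input, which is why the two functions should agree here.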
Details
^ - start of string
[^:]* - zero or more chars other than :
: - a colon
\s* - zero or more whitespaces
[^.]+ - one or more chars other than a dot
\. - a dot
\s* - zero or more whitespaces
([^.]+) - Capturing group 1: one or more chars other than dots
.* - the rest of the string.
Upvotes: 1
Reputation: 521103
You could try splitting on \.\s* after stripping off the leading Prayer: term.
x <- "Prayer: Lord. Have mercy on."
parts <- strsplit(sub("^\\w+:\\s*", "", x), "\\.\\s*")[[1]]
parts
[1] "Lord" "Have mercy on"
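To see what each step contributes, the same approach can be split into two statements. This is only a sketch of the code above; the intermediate name stripped is illustrative.

```r
x <- "Prayer: Lord. Have mercy on."

# Step 1: drop the leading word, the colon, and any following whitespace
stripped <- sub("^\\w+:\\s*", "", x)

# Step 2: split on a literal dot followed by optional whitespace;
# strsplit does not return an empty string for a match at the very end
parts <- strsplit(stripped, "\\.\\s*")[[1]]

first  <- parts[1]
second <- parts[2]
```

The pieces are then available separately as parts[1] and parts[2].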
Upvotes: 3