Reputation: 862
I want to extract a substring (the description details) from each of the following strings:
string1 <- "@{self=https://somesite.atlassian.net/rest/api/2/status/1; description=The issue is open and ready for the assignee to start work on it.; iconUrl=https://somesite.atlassian.net/images/icons/statuses/open.png; name=Open; id=1; statusCategory=}"
string2 <- "@{self=https://somesite.atlassian.net/rest/api/2/status/10203; description=; iconUrl=https://somesite.atlassian.net/images/icons/statuses/generic.png; name=Full Curation; id=10203; statusCategory=}"
I am trying to get the following:
ExtractedSubString1 = "The issue is open and ready for the assignee to start work on it."
ExtractedSubString2 = ""
I tried this:
library(stringr)
ExtractedSubString1 <- substr(string1, str_locate(string1, "description=")+12, str_locate(string1, "; iconUrl")-1)
ExtractedSubString2 <- substr(string2, str_locate(string2, "description=")+12, str_locate(string2, "; iconUrl")-1)
I am looking for a better way to accomplish this.
Upvotes: 0
Views: 204
Reputation: 1378
You could try:
test.1 <- gsub("description=", "", strsplit(string1, "; ")[[1]][2])
test.2 <- gsub("description=", "", strsplit(string2, "; ")[[1]][2])
This simply splits the string on "; ", which divides each string into 6 elements. The square brackets select the 2nd element, and gsub replaces "description=" with an empty string to remove it.
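Picking the 2nd element relies on the description always sitting in the same position. As a rough sketch of the same splitting idea that looks the field up by name instead, something like the following might work. Note that parse_fields is a hypothetical helper name, not part of base R, and it assumes that no value contains "; " itself.

```r
# Hypothetical helper: turn "@{key=value; key=value}" into a named character vector.
# Assumes no value contains the "; " separator.
parse_fields <- function(x) {
  # Drop the leading "@{" and the trailing "}".
  body <- sub("^@\\{", "", sub("\\}$", "", x))
  # Split into "key=value" pieces on the literal "; " separator.
  parts <- strsplit(body, "; ", fixed = TRUE)[[1]]
  # The key is everything before the first "="; the value is everything after it.
  keys <- sub("=.*$", "", parts)
  values <- sub("^[^=]*=", "", parts)
  setNames(values, keys)
}

string1 <- "@{self=https://somesite.atlassian.net/rest/api/2/status/1; description=The issue is open and ready for the assignee to start work on it.; iconUrl=https://somesite.atlassian.net/images/icons/statuses/open.png; name=Open; id=1; statusCategory=}"
string2 <- "@{self=https://somesite.atlassian.net/rest/api/2/status/10203; description=; iconUrl=https://somesite.atlassian.net/images/icons/statuses/generic.png; name=Full Curation; id=10203; statusCategory=}"

fields1 <- parse_fields(string1)
fields2 <- parse_fields(string2)

fields1[["description"]]
fields2[["description"]]
```

The other fields come along for free, e.g. fields1[["name"]] or fields2[["id"]].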
Upvotes: 1
Reputation: 38500
Using only base R's sub and back-referencing, you could do:
sub(".*description=(.*?);.*", "\\1", c(string1, string2))
[1] "The issue is open and ready for the assignee to start work on it." ""
The ".*" matches any sequence of characters, "description=" is a literal match, and ".*?" also matches any sequence of characters, but the ? forces a lazy match rather than a greedy one. ";" is a literal, and the "()" capture the sub-expression that is lazily matched. The back reference "\\1" returns the sub-expression captured in the parentheses.
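To see why the lazy ? matters, here is a small illustration on a made-up string (x is not from the question). I have added perl = TRUE so that the behaviour follows PCRE rules; that is my choice for this sketch and is not needed for the call above.

```r
# A made-up input with more than one ";" after the description.
x <- "a=1; description=hello; b=2; c=3"

# Lazy: the capture stops at the first ";" after "description=".
lazy <- sub(".*description=(.*?);.*", "\\1", x, perl = TRUE)

# Greedy: the capture runs on to the last ";" in the string.
greedy <- sub(".*description=(.*);.*", "\\1", x, perl = TRUE)

lazy    # "hello"
greedy  # "hello; b=2"
```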
Using the base R functions regexec and regmatches gets a bit closer to the method in the OP. For each string, regmatches returns the full match followed by the captured group, so sapply with "[" and index 2 is then used to extract the desired result.
sapply(regmatches(c(string1, string2),
regexec(".*description=(.*?);.*", c(string1, string2))),
"[", 2)
[1] "The issue is open and ready for the assignee to start work on it." ""
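If you need more than one field, the same regexec/regmatches idea can be wrapped in a small function. This is only a sketch: extract_field is a hypothetical name, it assumes the field name is a plain word with no regex metacharacters, and it swaps the lazy ".*?" for the negated class "[^;]*", which likewise stops at the first ";". It will not find the last field (statusCategory), since that one ends in "}" rather than ";".

```r
# Hypothetical helper: pull the value of one named field out of each string.
# Returns NA for strings where the field is not found.
extract_field <- function(x, field) {
  # "[{ ]" requires the field name to follow "{" or a space, so that
  # e.g. "name" does not match inside a longer key.
  pattern <- paste0("[{ ]", field, "=([^;]*);")
  # Element 1 of each match is the full match, element 2 the captured value.
  sapply(regmatches(x, regexec(pattern, x)), "[", 2)
}

string1 <- "@{self=https://somesite.atlassian.net/rest/api/2/status/1; description=The issue is open and ready for the assignee to start work on it.; iconUrl=https://somesite.atlassian.net/images/icons/statuses/open.png; name=Open; id=1; statusCategory=}"
string2 <- "@{self=https://somesite.atlassian.net/rest/api/2/status/10203; description=; iconUrl=https://somesite.atlassian.net/images/icons/statuses/generic.png; name=Full Curation; id=10203; statusCategory=}"

descriptions <- extract_field(c(string1, string2), "description")
status_names <- extract_field(c(string1, string2), "name")
```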
Upvotes: 2