Reputation: 4040
I have the following string:
"UNKNOWN_{_requestID___b9b6bcc4-c163-45d7-82d9-423a96cf5fe1_,_deviceID___9c84f871-9e95-45d5-9335-12e7d42b96a0_}_2018-08-15-15-43-01-296_529307b7-6316-4cdc-ab53-2e1158c651c6.txt"
and I want to extract the 529307b7-6316-4cdc-ab53-2e1158c651c6 part (the last part between _ and .txt).
Here is what I am trying to do using regex:
^\_\w\.txt
but without luck. I keep experimenting with it; please advise on the strategy and how to "attack" this.
Upvotes: 2
Views: 77
Reputation: 133508
Could you please try the following.
gsub(".*_|\\.txt","",x)
The output will be as follows.
[1] "529307b7-6316-4cdc-ab53-2e1158c651c6"
Explanation: the following breakdown is for explanation purposes only.
gsub( ##gsub is R's global substitution function; it replaces every match of the pattern in the variable's value.
".*_ ##Regex that matches everything from the start of the string up to the last _ (underscore), since .* is greedy.
| ##| (pipe) means OR, so either the previous or the following regex may match in the variable's value.
\\.txt" ##\\. escapes the dot so that it is treated as a literal dot rather than with its special meaning, so this matches the string .txt
,"" ##Every match of either regex is replaced with "", the empty string.
,x) ##The variable x on which gsub is performed.
Where the input variable x's value is as follows.
x <- "UNKNOWN_{_requestID___b9b6bcc4-c163-45d7-82d9-423a96cf5fe1_,_deviceID___9c84f871-9e95-45d5-9335-12e7d42b96a0_}_2018-08-15-15-43-01-296_529307b7-6316-4cdc-ab53-2e1158c651c6.txt"
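One caveat worth illustrating: because gsub replaces every match, an unanchored \\.txt also removes a ".txt" that occurs in the middle of the name. The sketch below uses a made-up file name (y is hypothetical, not from the question) to compare the pattern with and without a trailing $ anchor.

```r
# Hypothetical file name in which ".txt" also occurs in the middle
y <- "log_my.txtfile.txt"

# Unanchored: gsub() removes every ".txt", not only the extension
unanchored <- gsub(".*_|\\.txt", "", y)
unanchored
# [1] "myfile"

# Anchored with $: only the trailing ".txt" is removed
anchored <- gsub(".*_|\\.txt$", "", y)
anchored
# [1] "my.txtfile"
```

For the string in the question both patterns give the same result, so the anchor only matters if such names can occur.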
Upvotes: 1
Reputation: 70643
Here's an approach using a hidden gem from the tools package.
x <- "UNKNOWN_{_requestID___b9b6bcc4-c163-45d7-82d9-423a96cf5fe1_,_deviceID___9c84f871-9e95-45d5-9335-12e7d42b96a0_}_2018-08-15-15-43-01-296_529307b7-6316-4cdc-ab53-2e1158c651c6.txt"
out <- strsplit(x, "_")[[1]]
out <- out[length(out)]
tools::file_path_sans_ext(out)
[1] "529307b7-6316-4cdc-ab53-2e1158c651c6"
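If there are several file names, the same idea can probably be applied to a whole vector. This is a minimal sketch under that assumption; the names files and last_piece and the example file names are made up for illustration.

```r
# Hypothetical vector of file names (not from the question)
files <- c("a_b_first-id.txt", "x_second-id.txt")

# strsplit() returns a list with one character vector per file name;
# vapply() picks the last piece of each
last_piece <- vapply(strsplit(files, "_"),
                     function(p) p[length(p)],
                     character(1))

# file_path_sans_ext() is vectorised and drops the extension from each piece
ids <- tools::file_path_sans_ext(last_piece)
ids
```

Here ids should hold "first-id" and "second-id".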
Upvotes: 2
Reputation: 658
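Apply sub twice: the inner call removes everything up to the last underscore, the outer call removes .txt and anything after it: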
text <- c("UNKNOWN_{_requestID___b9b6bcc4-c163-45d7-82d9-423a96cf5fe1_,_deviceID___9c84f871-9e95-45d5-9335-12e7d42b96a0_}_2018-08-15-15-43-01-296_529307b7-6316-4cdc-ab53-2e1158c651c6.txt" )
sub("\\.txt.*", "", sub(".*\\_", "", text))
Upvotes: 1
Reputation: 626794
You may use
sub("^.*_(.*)\\.txt$", "\\1", x)
sub will perform a single search and replace operation. It will find a match if the string conforms to the following:
^ - start of string
.*_ - any 0+ chars, as many as possible, up to the last _
(.*) - any 0+ chars (captured into Group 1, later referred to with \1 from the replacement pattern), as many as possible, up to the final .txt
\\.txt$ - .txt (. must be escaped to match a literal dot) at the end of the string ($).
x <- "UNKNOWN_{_requestID___b9b6bcc4-c163-45d7-82d9-423a96cf5fe1_,_deviceID___9c84f871-9e95-45d5-9335-12e7d42b96a0_}_2018-08-15-15-43-01-296_529307b7-6316-4cdc-ab53-2e1158c651c6.txt"
sub("^.*_(.*)\\.txt$", "\\1", x)
## => [1] "529307b7-6316-4cdc-ab53-2e1158c651c6"
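Note that sub returns its input unchanged when the pattern does not match, which can hide bad inputs. A possible guard, sketched here with a hypothetical helper name extract_id and a made-up non-matching input, returns NA in that case:

```r
x <- "UNKNOWN_{_requestID___b9b6bcc4-c163-45d7-82d9-423a96cf5fe1_,_deviceID___9c84f871-9e95-45d5-9335-12e7d42b96a0_}_2018-08-15-15-43-01-296_529307b7-6316-4cdc-ab53-2e1158c651c6.txt"
pattern <- "^.*_(.*)\\.txt$"

# extract_id is a hypothetical helper: it applies sub() only where the
# pattern matches and yields NA for strings that do not conform
extract_id <- function(s) {
  ifelse(grepl(pattern, s), sub(pattern, "\\1", s), NA_character_)
}

res <- extract_id(c(x, "no-underscore.csv"))
res
```

The first element of res should be the ID from the question and the second NA.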
Upvotes: 3