Reputation: 367
I have this string:
string <- "DIS_S_CD_EFS-NO_PCI-CD_ACT_CG-SOM_MT_ECT_CVE"
I need to extract only SOM_MT_ECT_CVE from it.
So for me the keyword is SOM (I need to identify its position).
I tried this:
d <-substr(gregexpr(pattern ='SOM',"DIS_S_CD_EFS-NO_PCI-CD_ACT_CG-SOM_MT_ECT_CVE"),
nchar("DIS_S_CD_EFS-NO_PCI-CD_ACT_CG-SOM_MT_ECT_CVE"),"DIS_S_CD_EFS-NO_PCI-CD_ACT_CG-SOM_MT_ECT_CVE")
But it returns NA.
Upvotes: 2
Views: 5402
Reputation: 887118
One option is sub: match any characters (.*) up to 'SOM', capture 'SOM' together with the rest of the string in a group ((...)), and in the replacement use the backreference (\\1) to that captured group
sub(".*(SOM_.*)", "\\1", string)
#[1] "SOM_MT_ECT_CVE"
Or use str_extract from stringr
library(stringr)
str_extract(string, "SOM.*")
#[1] "SOM_MT_ECT_CVE"
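If the goal is specifically to locate the position of the keyword, the substr attempt from the question can also be made to work. The original call returns NA because substr expects its arguments in the order (x, start, stop), while the attempt passes the gregexpr result as x and the string itself as stop. A minimal sketch, assuming only the first occurrence of 'SOM' matters (the names pos and result are just for illustration):

```r
string <- "DIS_S_CD_EFS-NO_PCI-CD_ACT_CG-SOM_MT_ECT_CVE"

# regexpr() returns the start position of the first match (-1 if there is none);
# fixed = TRUE treats "SOM" as a literal string rather than a regular expression
pos <- regexpr("SOM", string, fixed = TRUE)

# substr() takes its arguments in the order (x, start, stop)
result <- if (pos > 0) substr(string, pos, nchar(string)) else NA_character_
result
#[1] "SOM_MT_ECT_CVE"
```

regexpr is used here instead of gregexpr because it returns a plain integer position rather than a list, which is easier to pass on to substr.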
Upvotes: 2
Reputation: 51592
You can split on the hyphen and take the last piece, i.e.
tail(strsplit(string, '-', fixed = TRUE)[[1]], 1)
#[1] "SOM_MT_ECT_CVE"
Or with word from stringr,
stringr::word(string, -1, sep = '-')
#[1] "SOM_MT_ECT_CVE"
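Both of these assume that the part starting with SOM is always the last hyphen-separated piece. If that is not guaranteed, a possible variation is to split in the same way and keep the piece that begins with the keyword. This is only a sketch: parts and som_part are illustrative names, and startsWith requires R 3.3.0 or later.

```r
string <- "DIS_S_CD_EFS-NO_PCI-CD_ACT_CG-SOM_MT_ECT_CVE"

# Split on literal hyphens, then keep only the pieces that begin with "SOM"
parts <- strsplit(string, "-", fixed = TRUE)[[1]]
som_part <- parts[startsWith(parts, "SOM")]
som_part
#[1] "SOM_MT_ECT_CVE"
```

Note that this returns only the matching piece itself, so anything after a later hyphen would be dropped, and it returns character(0) if no piece starts with SOM.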
Upvotes: 1