To capture message from a character after specific texts

Question

I have the following character vector in R. Is there a way to keep only the text that comes after [SQ]?

Input

df   # df is a character vector
[1] "[Mi][OD][SQ]Nice message1."                        
[2] "[Mi][OD][SQ]Nice message2."                         
[3] "[RO] ERROR: Could not SQLExecDirect 'SELECT * FROM "

Expected output

df
[1] Nice message1. Nice message2

In case there are more [SQ] entries, as below

df   # df is a character vector
[1] "[Mi][OD][SQ]Nice message1."                        
[2] "[Mi][OD][SQ]Nice message2."                         
[3] "[RO] ERROR: Could not SQLExecDirect 'SELECT * FROM "
[4] "[Mi][OD][SQ]Nice message3."

Expected output

df
[1] Nice message1. Nice message2. Nice message3

akrun · Accepted Answer

One option is to use str_extract to extract the substring and then wrap it in na.omit to remove the NA elements, which occur when a string has no match. Here, a regex lookbehind checks for the pattern [SQ], and the rest of the pattern extracts the characters that follow it.

library(stringr)
as.vector(na.omit(str_extract(df, "(?<=\\[SQ\\]).*")))
#[1] "Nice message1." "Nice message2." "Nice message3."

If it needs to be a single string, use str_c to collapse the strings. The extracted messages already end in a period, so a space is enough as the separator.

str_c(na.omit(str_extract(df, "(?<=\\[SQ\\]).*")), collapse = ' ')
#[1] "Nice message1. Nice message2. Nice message3."
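
For comparison, the same lookbehind idea can be sketched in base R without stringr. This is an alternative, not part of the answer above; the names m, msgs and single are illustrative only. It assumes the same df as in the data section and uses regexpr with perl = TRUE, because the lookbehind syntax needs the PCRE engine.

```r
# Same sample vector as in the data section
df <- c("[Mi][OD][SQ]Nice message1.", "[Mi][OD][SQ]Nice message2.",
        "[RO] ERROR: Could not SQLExecDirect 'SELECT * FROM ",
        "[Mi][OD][SQ]Nice message3.")

# regexpr() locates the first match in each element (-1 when there is none);
# perl = TRUE enables the lookbehind (?<=...)
m <- regexpr("(?<=\\[SQ\\]).*", df, perl = TRUE)

# regmatches() returns only the substrings that matched,
# so elements without [SQ] are dropped and no NA handling is needed
msgs <- regmatches(df, m)

# Collapse into one string; each message already ends with a period
single <- paste(msgs, collapse = " ")
```

Because regmatches drops non-matching elements by itself, the na.omit step is not needed in this variant.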

data

df <- c("[Mi][OD][SQ]Nice message1.", "[Mi][OD][SQ]Nice message2.", 
"[RO] ERROR: Could not SQLExecDirect 'SELECT * FROM ", "[Mi][OD][SQ]Nice message3."
)
