Reputation: 332
I have the following character vector in R. Is there a way to extract only the text that comes after [SQ]?
Input
df # df is a character
[1] "[Mi][OD][SQ]Nice message1."
[2] "[Mi][OD][SQ]Nice message2."
[3] "[RO] ERROR: Could not SQLExecDirect 'SELECT * FROM "
Expected output
df
[1] Nice message1. Nice message2
The same should work when there are more [SQ] entries, as below:
df # df is a character
[1] "[Mi][OD][SQ]Nice message1."
[2] "[Mi][OD][SQ]Nice message2."
[3] "[RO] ERROR: Could not SQLExecDirect 'SELECT * FROM "
[4] "[Mi][OD][SQ]Nice message3."
Expected output
df
[1] Nice message1. Nice message2. Nice message3
Upvotes: 1
Views: 28
Reputation: 887691
One option is to use str_extract to extract the substring and then wrap it with na.omit to remove the NA elements, which occur when a string has no match. Here, we use a regex lookbehind, (?<=\\[SQ\\]), to assert that the literal pattern [SQ] precedes the match, so that .* extracts only the characters that follow it.
library(stringr)
as.vector(na.omit(str_extract(df, "(?<=\\[SQ\\]).*")))
#[1] "Nice message1." "Nice message2." "Nice message3."
If it needs to be a single string, use str_c to collapse the strings. Each extracted message already ends with a period, so a single space is enough as the separator
str_c(na.omit(str_extract(df, "(?<=\\[SQ\\]).*")), collapse = ' ')
#[1] "Nice message1. Nice message2. Nice message3."
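For readers who would rather not load stringr, the same idea can be sketched in base R. This is only an illustrative alternative, not part of the original approach: it assumes the tag always appears literally as [SQ], and the names has_sq, messages and combined are made up for the example.

```r
# Sample input, copied from the question
df <- c("[Mi][OD][SQ]Nice message1.",
        "[Mi][OD][SQ]Nice message2.",
        "[RO] ERROR: Could not SQLExecDirect 'SELECT * FROM ",
        "[Mi][OD][SQ]Nice message3.")

# TRUE for elements that contain the literal tag [SQ]
# (fixed = TRUE turns off regex interpretation of the brackets)
has_sq <- grepl("[SQ]", df, fixed = TRUE)

# For the matching elements, drop everything up to and including the first [SQ]
messages <- sub("^.*?\\[SQ\\]", "", df[has_sq], perl = TRUE)

# Collapse the messages into a single string, separated by a space
combined <- paste(messages, collapse = " ")

print(messages)
print(combined)
```

Filtering with grepl first plays the role that na.omit plays in the stringr version: elements without [SQ] are dropped before anything is extracted.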
data
df <- c("[Mi][OD][SQ]Nice message1.",
        "[Mi][OD][SQ]Nice message2.",
        "[RO] ERROR: Could not SQLExecDirect 'SELECT * FROM ",
        "[Mi][OD][SQ]Nice message3.")
Upvotes: 1