despo

Reputation: 3

Replace a string comprising fixed text plus a varying pattern in R

I'm trying to remove a fixed piece of text followed by a varying combination of digits or letters in an R script.

Pattern to be removed: " Alpha code for WIS - Info Only - see journal XXXX"

where XXXX can be a 4-digit number, a letter followed by a 3-digit number, or 3 letters.

I've tried already:

str_replace(x, '^\\s "Alpha code for WIS - Info Only - see journal" \\b[A-Z1-9]{4}\\b','') 

str_replace(x, '^\\s "Alpha code for WIS - Info Only - see journal" ([0-9])','')  

str_replace(x, '^\\sAlpha code for WIS - Info Only - see journal ([0-9]+)','') 

None of these work. I've also tried similar regexes with gsub, but got no further.

I could go in 3 steps, replacing first the 4-digit number, then the letter combination and finally the alphanumeric, if it's easier.

Upvotes: 0

Views: 229

Answers (1)

Giuseppe Ricupero

Reputation: 6272

Try a regex like this with gsub:

Alpha code for WIS - Info Only - see journal ([0-9]{4}|[a-zA-Z][0-9]{3}|[a-zA-Z]{3})

So the snippet of code will be:

test <- "Line1: Alpha code for WIS - Info Only - see journal 1234\nLine2: Alpha code for WIS - Info Only - see journal A123\nLine3: Alpha code for WIS - Info Only - see journal AbC\nLine4: line 4 content"
result <- gsub("Alpha code for WIS - Info Only - see journal ([0-9]{4}|[a-zA-Z][0-9]{3}|[a-zA-Z]{3})", '', test)
print(result)

Output

[1] "Line1: \nLine2: \nLine3: \nLine4: line 4 content"
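
As a side note, the attempts in the question most likely fail for three reasons: `^` only matches at the very start of the string, the literal double quotes inside the pattern do not occur in the data, and a class such as `[A-Z1-9]{4}` cannot match the 3-letter codes. Below is a minimal sketch of a slightly stricter variant in base R. It assumes the code is followed by a non-word character or the end of the string. It also consumes the leading space and adds `\\b` so that longer tokens are not partially removed. The names `x`, `pattern` and `cleaned` are illustrative only.

```r
# Illustrative input; `x`, `pattern` and `cleaned` are hypothetical names.
x <- c(
  "Foo Alpha code for WIS - Info Only - see journal 1234",
  "Bar Alpha code for WIS - Info Only - see journal A123",
  "Baz Alpha code for WIS - Info Only - see journal AbC",
  "No match here"
)

# Optional leading whitespace, then the fixed text, then one of the three
# code shapes (4 digits, letter + 3 digits, 3 letters), ending at a word boundary.
pattern <- paste0(
  "\\s*Alpha code for WIS - Info Only - see journal ",
  "([0-9]{4}|[A-Za-z][0-9]{3}|[A-Za-z]{3})\\b"
)

# gsub() replaces every occurrence; perl = TRUE selects PCRE semantics for \\b.
cleaned <- gsub(pattern, "", x, perl = TRUE)
print(cleaned)
```

With this input, the first three elements are reduced to "Foo", "Bar" and "Baz", and the last one is left unchanged. If the data contains other code shapes, the alternation would need to be extended accordingly.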

Upvotes: 1
