caniero

Reputation: 25

Removing the part of a string between a given start and end with a regular expression in R

I have a set of strings in which the first part starts with [ANONYMOUS] and ends with a ; sign. I'd like to remove this part from each string.

For example, I have this string:

"[ANONYMOUS],1756 , An Intro, V19;BIAN C, 2016, WINIT, V7, P83;"

and I'd like to have this:

"BIAN C, 2016, WINIT, V7, P83;"

Upvotes: 1

Views: 889

Answers (2)

Chris Ruehlemann

Reputation: 21400

You can use str_extract with a lookbehind and a lookahead (str1 is the example string from the question):

library(stringr)
str_extract(str1, "(?<=;).*(?=;)")
# [1] "BIAN C, 2016, WINIT, V7, P83"

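This pattern matches everything (.*) between a semicolon on the left, checked by the lookbehind (?<=;), and a semicolon on the right, checked by the lookahead (?=;). Because .* is greedy, the match runs from the first semicolon to the last one. Neither semicolon is part of the result, so the trailing ; from the desired output is dropped.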
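If the trailing semicolon should be kept, as in the desired output from the question, one possible variant is to drop the lookahead so the match runs to the end of the string. The sketch below uses base R (regexpr and regmatches with perl = TRUE) instead of stringr, so it runs without extra packages. It assumes every string contains at least one semicolon, because regmatches silently drops elements that have no match.

```r
# Example string from the question
str1 <- "[ANONYMOUS],1756 , An Intro, V19;BIAN C, 2016, WINIT, V7, P83;"

# Lookbehind only: start right after the first ";" and match to the end.
# perl = TRUE is needed because lookarounds are a PCRE feature.
m <- regexpr("(?<=;).*", str1, perl = TRUE)
kept <- regmatches(str1, m)
kept
# [1] "BIAN C, 2016, WINIT, V7, P83;"
```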

Upvotes: 0

akrun

Reputation: 887118

We could use sub to match, from the start (^) of the string, a literal [ (it is a metacharacter, so we escape it with \\), followed by the string 'ANONYMOUS', the closing bracket ] (also escaped), one or more characters that are not a ; ([^;]+), and then the first ; itself, and replace the match with an empty string ("").

sub("^\\[ANONYMOUS\\][^;]+;", "", str1)

data

str1 <- '[ANONYMOUS],1756 , An Intro, V19;BIAN C, 2016, WINIT, V7, P83;'
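Since the question mentions a set of strings, here is a minimal sketch of the same sub call applied to a character vector. The vector name refs and its second element are made up for illustration. sub is vectorized, and a string that does not start with [ANONYMOUS] is returned unchanged because the pattern is anchored with ^.

```r
# Hypothetical input: one entry with the [ANONYMOUS] prefix, one without
refs <- c(
  "[ANONYMOUS],1756 , An Intro, V19;BIAN C, 2016, WINIT, V7, P83;",
  "BIAN C, 2016, WINIT, V7, P83;"
)

# Remove the leading "[ANONYMOUS]...;" part from each element
cleaned <- sub("^\\[ANONYMOUS\\][^;]+;", "", refs)
cleaned
# [1] "BIAN C, 2016, WINIT, V7, P83;" "BIAN C, 2016, WINIT, V7, P83;"
```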

Upvotes: 1
