Revious

Reputation: 8156

Regular expression to match speaker1: a \n speaker1: b → speaker1: a b

This was my initial attempt:

^(speaker1|speaker2): (.*?)[\n\s\r]+\k<1>: 

But it doesn't work in cases like this:

speaker1: kskdjsk
speaker2: 223
speaker1: fkjfdsj

because the (.*?) part of the regex keeps consuming text until it reaches the 3rd line.

So I tried adding a negative lookbehind, (?<!^(speaker1|speaker2):), like this:

^(speaker1|speaker2): (.*?(?<!^(speaker1|speaker2):))[\n\s\r]+\k<1>: 

But it doesn't work.

Upvotes: 0

Views: 26

Answers (1)

trincot

Reputation: 350760

You can check at each character position that a speaker pattern does not start there, and only then consume the character:

\bspeaker[12]:((?:(?!\bspeaker[12]).)*)
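As a rough illustration, here is how that pattern might be applied in Python. The language, the `re.DOTALL` flag (so that `.` also crosses line breaks), the helper name `segments`, and the whitespace trimming are all assumptions of this sketch, not part of the question.

```python
import re

# Pattern from the answer above. re.DOTALL is an assumption of this sketch:
# it lets "." match newlines, so a turn may span several lines.
SEGMENT = re.compile(r"\bspeaker[12]:((?:(?!\bspeaker[12]).)*)", re.DOTALL)


def segments(text):
    """Return the text that follows each speaker label, whitespace-trimmed.

    `segments` is a hypothetical helper name used only for illustration.
    """
    return [m.group(1).strip() for m in SEGMENT.finditer(text)]


sample = "speaker1: kskdjsk\nspeaker2: 223\nspeaker1: fkjfdsj"
print(segments(sample))  # ['kskdjsk', '223', 'fkjfdsj']
```

Each match stops right before the next speaker label, because the negative lookahead fails at that position and the repetition cannot continue past it.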

Having said that, if your programming environment allows for it, it will be more efficient to match only speaker[12]: and split the input text on those matches.
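A minimal sketch of the split-based approach, assuming Python and the merge goal stated in the question title (consecutive turns by the same speaker joined into one line). The function name `merge_turns` and the whitespace normalisation are illustrative choices, not something the question specifies.

```python
import re

# Match only the label; the capturing group makes re.split keep the labels.
LABEL = re.compile(r"\b(speaker[12]):")


def merge_turns(text):
    """Join consecutive turns by the same speaker into a single line.

    `merge_turns` is a hypothetical helper name used only for illustration.
    """
    # With a capturing group, re.split returns
    # [prefix, label, body, label, body, ...]
    parts = LABEL.split(text)
    merged = []  # list of [speaker, body] pairs
    for speaker, body in zip(parts[1::2], parts[2::2]):
        body = " ".join(body.split())  # collapse newlines and extra spaces
        if merged and merged[-1][0] == speaker:
            # Same speaker as the previous turn: append to that turn.
            merged[-1][1] = f"{merged[-1][1]} {body}".strip()
        else:
            merged.append([speaker, body])
    return "\n".join(f"{speaker}: {body}" for speaker, body in merged)


print(merge_turns("speaker1: a\nspeaker1: b\nspeaker2: c"))
# speaker1: a b
# speaker2: c
```

Any text before the first label ends up in `parts[0]` and is ignored here; whether that is acceptable depends on the input.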

Upvotes: 1
