Reputation: 8156
This was my initial attempt:
^(speaker1|speaker2): (.*?)[\n\s\r]+\k<1>:
But it doesn't work in cases like this:
speaker1: kskdjsk
speaker2: 223
speaker1: fkjfdsj
because the lazy (.*?) keeps matching until it finds the 3rd line.
So I tried adding a negative lookbehind, (?<!^(speaker1|speaker2):):
^(speaker1|speaker2): (.*?(?<!^(speaker1|speaker2):))[\n\s\r]+\k<1>:
But that doesn't work either.
Upvotes: 0
Views: 26
Reputation: 350760
You can check at each character position that a speaker pattern does not start there:
\bspeaker[12]:((?:(?!\bspeaker[12]).)*)
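To see the pattern above in action, here is a minimal Python sketch. The sample text is the one from the question, laid out on three lines; the `re.DOTALL` flag and the names `TURN`, `text` and `turns` are my own assumptions, added so that an utterance may span several lines.

```python
import re

# The pattern from the answer, unchanged. The negative lookahead is tested
# before every single character, so the capture group stops right before
# the next speaker label.
# Assumption: re.DOTALL lets "." cross line breaks (multi-line utterances).
TURN = re.compile(r"\bspeaker[12]:((?:(?!\bspeaker[12]).)*)", re.DOTALL)

text = "speaker1: kskdjsk\nspeaker2: 223\nspeaker1: fkjfdsj"

# One match per utterance; group 1 holds the text after the label.
turns = [m.group(1).strip() for m in TURN.finditer(text)]
print(turns)  # ['kskdjsk', '223', 'fkjfdsj']
```

Without `re.DOTALL` each match simply ends at the line break, which is also fine when every utterance fits on one line.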
Having said that, if your programming environment allows it, it will be more efficient to match only speaker[12]: and split the input text at the matches.
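The split approach could look roughly like this in Python. This is only a sketch: the helper name `split_turns` is hypothetical, and I wrapped the label in a capture group so that `re.split` keeps the speaker names in its result.

```python
import re

# Match only the speaker label. The capture group makes re.split
# return the labels along with the text between them.
LABEL = re.compile(r"\b(speaker[12]):")

def split_turns(text):
    """Return a list of (speaker, utterance) pairs (hypothetical helper)."""
    # Result layout: [prefix, label, body, label, body, ...]
    parts = LABEL.split(text)
    labels, bodies = parts[1::2], parts[2::2]
    return [(label, body.strip()) for label, body in zip(labels, bodies)]

text = "speaker1: kskdjsk\nspeaker2: 223\nspeaker1: fkjfdsj"
print(split_turns(text))
# [('speaker1', 'kskdjsk'), ('speaker2', '223'), ('speaker1', 'fkjfdsj')]
```

The regex engine only has to find the labels here, instead of running a lookahead at every character of every utterance.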
Upvotes: 1