maddie

Reputation: 1954

Python Regular Expression: Multiline pattern match with more than two substrings

I want to use a regex to find merge conflicts in a file.

I've found previous posts that show how to match this structure:

FIRST SUBSTRING 
/* several 
    new 
     lines 
*/
SECOND SUBSTRING

which works with the following regex: (^FIRST SUBSTRING)(.+)((?:\n.+)+)(SECOND SUBSTRING)

However, I need to match this pattern:

FIRST SUBSTRING 
/* several 
    new 
     lines 
*/
SECOND SUBSTRING
/* several 
    new 
     lines 
*/
THIRD SUBSTRING

where the first, second, and third substrings are <<<<<<<, =======, and >>>>>>> respectively.

I tried (^<<<<<<<)(.+)((?:\n.+)+)(=======)(.+)((?:\n.+)+)(>>>>>>), but it does not work, as you can see in this demo. The shorter (^<<<<<<<)(.+)((?:\n.+)+)(=======) does work, but it is not exactly what I am looking for.

Upvotes: 1

Views: 106

Answers (2)

Cappa

Reputation: 117

You need to set the s flag (single line: dot matches newline) to match text that spans several lines. With it set, .*? matches across \n up to the next part of the pattern (the ? makes the quantifier lazy). With this flag set, the regex below matches what you need.

(<{7})(.*)(={7})(.*?)(>{7})(.*?\n)
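In Python, the s flag is spelled re.DOTALL. The following is a minimal sketch of the same idea, not the exact expression above: it assumes both middle groups should be lazy and that each marker must start a line (hence re.MULTILINE), so that two conflicts in one file are not merged into a single match. The sample text and the labels "HEAD" and "feature" are made up for illustration.

```python
import re

# Hypothetical sample text; the marker labels ("HEAD", "feature") are made up.
sample = (
    "before\n"
    "<<<<<<< HEAD\n"
    "ours line 1\n"
    "ours line 2\n"
    "=======\n"
    "theirs line\n"
    ">>>>>>> feature\n"
    "after\n"
)

# re.DOTALL is Python's name for the "s" flag: "." also matches "\n".
# re.MULTILINE lets "^" and "$" anchor at every line, so a marker only
# counts when it starts a line.
conflict_re = re.compile(
    r"^<{7}[^\n]*\n"   # opening marker line
    r"(.*?)"           # group 1: lazy, stops at the first separator line
    r"^={7}[ \t]*\n"   # separator line
    r"(.*?)"           # group 2: lazy, stops at the first closing marker
    r"^>{7}[^\n]*$",   # closing marker line
    re.DOTALL | re.MULTILINE,
)

# One (group 1, group 2) tuple per conflict found in the text.
conflicts = [m.groups() for m in conflict_re.finditer(sample)]
print(conflicts)
```

Because both groups are lazy, finditer yields one match per conflict block rather than one match stretching from the first <<<<<<< to the last >>>>>>>.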

Upvotes: 0

MikeMajara

Reputation: 962

Your expression does work with a couple of slight changes. The marker lengths do not match exactly: the closing marker in your pattern has six > characters instead of seven. You are also asking for at least one character after the SECOND SUBSTRING with (.+), when there are none in the text.

(<<<<<<<)(.+)((?:\n.+)+)(=======)(.*)((?:\n.+)+)(>>>>>>>)

With those changes it produces the groups you expect (which the answer in the comments does not). You probably want separate groups so you can distinguish your code from theirs.
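To illustrate pulling the two sides apart, here is a rough sketch in the same line-by-line style. It is a variation on the expression above, not a copy of it: the named groups, the negative lookaheads that stop each side at the next marker, and the re.VERBOSE layout are all my own assumptions, and the sample text is made up.

```python
import re

# Hypothetical sample text; the marker labels ("HEAD", "feature") are made up.
sample = (
    "before\n"
    "<<<<<<< HEAD\n"
    "ours line 1\n"
    "ours line 2\n"
    "=======\n"
    "theirs line\n"
    ">>>>>>> feature\n"
    "after\n"
)

# re.VERBOSE ignores whitespace in the pattern and allows "#" comments.
# "." does NOT match "\n" here, so every ".*" stays on a single line.
named_re = re.compile(
    r"""
    ^<<<<<<<(?P<ours_label>.*)          # opening marker and its label
    (?P<ours>(?:\n(?!=======$).*)*)     # lines up to the separator
    \n=======$                          # separator line
    (?P<theirs>(?:\n(?!>>>>>>>).*)*)    # lines up to the closing marker
    \n>>>>>>>(?P<theirs_label>.*)       # closing marker and its label
    """,
    re.MULTILINE | re.VERBOSE,
)

m = named_re.search(sample)
# Each captured line is preceded by "\n", so splitlines() yields an empty
# first item; drop it to get the plain list of lines.
ours = m.group("ours").splitlines()[1:]
theirs = m.group("theirs").splitlines()[1:]
print(m.group("ours_label").strip(), ours)
print(m.group("theirs_label").strip(), theirs)
```

Querying groups by name keeps the calling code readable even if the pattern later gains or loses a group.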

Also, if you have to choose among working expressions, I would pick yours over the proposed alternatives for readability. Regexes are not friendly to read, and repetition counts (among other sophistications) make them harder still. The same goes for ?:. Just query the specific groups you need; there is no need to avoid creating groups there.

Upvotes: 2
