Trewq

Reputation: 3237

multiline regex

I am having some trouble searching for a multiline pattern using regex. Here is the sample multiline string:

some command [first line]\n
second line \n
yes can have multiple lines\n
\n
something else that I do not care about.

Here is what I have tried so far:

>>> match = re.match(r"^(.+)\n((.*\n)*)\n",body,re.MULTILINE)
>>> match.groups()
('some command [first line]', 'second line \nyes can have multiple lines\n', 'yes can have multiple lines\n')

I am looking for match.group(1) and match.group(2), and I am happy with them, but it bugs me that I also get match.group(3), which I do not expect (and which makes me think that my regexp is not right).

Also, I do not seem to get named patterns right:

>>> match = re.match(r"^(.+)\n((?P<bd>.*\n)*)\n",body,re.MULTILINE)
>>> match.group(bd)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'bd' is not defined

I went through the Python Regular Expressions from Google, but it is obvious that I have not gotten the complete picture yet.

Upvotes: 3

Views: 533

Answers (1)

stema

Reputation: 93046

Did I understand you correctly: the result you expect is in group 3 instead of group 2?

If that is your problem, you can make a group non-capturing by putting ?: right after its opening parenthesis, like this:

re.match(r"^(.+)\n(?:(.*\n)*)\n",body,re.MULTILINE)

With this you will get only two groups in the result. Group 2 is then the inner, repeated group, which holds only the last line it matched.

If I misunderstood and you want to get rid of group 3 instead, then

re.match(r"^(.+)\n((?:.*\n)*)\n",body,re.MULTILINE)

would be the solution.
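
To see how the two variants differ, here is a minimal sketch that runs both patterns against the sample text. The `body` string is reconstructed from the question's sample, so its exact contents (trailing space, final newline) are an assumption, and the variable names `inner_only` and `whole_block` are only illustrative.

```python
import re

# Assumed reconstruction of the sample text from the question.
body = (
    "some command [first line]\n"
    "second line \n"
    "yes can have multiple lines\n"
    "\n"
    "something else that I do not care about.\n"
)

# Outer group non-capturing: group 2 is the inner, repeated group,
# which only keeps the last line it matched.
inner_only = re.match(r"^(.+)\n(?:(.*\n)*)\n", body, re.MULTILINE)
print(inner_only.groups())

# Inner group non-capturing: group 2 is the whole block of lines
# up to (not including) the blank line.
whole_block = re.match(r"^(.+)\n((?:.*\n)*)\n", body, re.MULTILINE)
print(whole_block.groups())
```

A repeated capturing group keeps only its last match, which is why the first variant reports a single line while the second returns every line before the blank one.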

Named groups

You can access your named group like this:

m.group('bd')

You need to give group() either an integer or a string as its argument; see MatchObject. A bare bd is looked up as a Python variable, which is why you get the NameError.
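
As a minimal sketch of the lookup by name, the example below names the group that holds the whole block of lines. Moving the `bd` name from the inner repeated group to the outer one is an assumption about what the asker wants, and `body` is again a reconstruction of the question's sample rather than the original input.

```python
import re

# Assumed reconstruction of the sample text from the question.
body = (
    "some command [first line]\n"
    "second line \n"
    "yes can have multiple lines\n"
    "\n"
    "something else that I do not care about.\n"
)

# Name the group that spans all continuation lines; the inner group is
# non-capturing so it does not add an extra entry to groups().
m = re.match(r"^(.+)\n(?P<bd>(?:.*\n)*)\n", body, re.MULTILINE)

print(m.group("bd"))   # look up by name: the name is passed as a string
print(m.group(2))      # the same group is still reachable by its number
print(m.groupdict())   # all named groups as a dict
```

If the name stays on the inner repeated group, as in the question, `m.group('bd')` would return only the last line that group matched.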

Upvotes: 4
