Help Needed

Reputation: 11

Regex Simple Multiline Question

pattern = '^([R][uU][nN])(.*)\((.*)\).'

Essentially what I am trying to do with this is find procedure calls in the data I have in the form:

RUN proc-name-can-be-anything (input parameters, input, output).

The call can also span several lines:

RUN proc-name-long
                  ( 
                    input parameter,
                    other parameter,
                    one_more_just_because 
                  ).

The pattern works fine for single-line instances, but I need it to handle multi-line calls as well, and I have been tearing my hair out as I dive back into Python and regular expressions. (I know the pattern only works on a single line at the moment because "." does not match a newline.) All I really care about is the parameters of the procedure call I am looking at.

Thanks

Upvotes: 1

Views: 120

Answers (2)

eyquem

Reputation: 27575

import re

ss = '''RUN proc-name ( input parameter,
                        other parameter,
                        one_more_just_because ).'''

# re.DOTALL lets '.' match newlines; re.MULTILINE lets '^' match at the start of every line.
regx = re.compile(r'^(R[Uu][Nn]) +(.+?) *\((.*?)\)\.', re.MULTILINE | re.DOTALL)

print(regx.search(ss).groups())

If you don't want to use re.DOTALL, you can also do:

import re

ss = '''RUN proc-name ( input parameter,
                        other parameter,
                        one_more_just_because ).'''


# [\s\S] matches any character, newline included, so re.DOTALL is not needed.
regx = re.compile(r'^(R[Uu][Nn]) +(.+?) *\(([\s\S]*?)\)\.', re.MULTILINE)

print(regx.search(ss).groups())
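Since the question is mostly about the parameters, here is a minimal sketch of how the captured text could be split into individual parameters for every call in a block of data. It is a variation rather than the exact pattern above: it uses \s* before the opening parenthesis so that a parenthesis placed on the following line is also accepted. It assumes the procedure name contains no whitespace and that parameters are comma-separated with no nested parentheses; the helper name extract_params and the sample text are made up for illustration.

```python
import re

# Variation of the pattern above: \s* allows the '(' to sit on the next line.
RUN_CALL = re.compile(r'^R[Uu][Nn]\s+([^\s(]+)\s*\((.*?)\)\.',
                      re.MULTILINE | re.DOTALL)

def extract_params(text):
    """Return a list of (procedure name, [parameters]) for each RUN call found."""
    calls = []
    for match in RUN_CALL.finditer(text):
        name = match.group(1)
        # Split on commas and drop the surrounding whitespace and newlines.
        params = [p.strip() for p in match.group(2).split(',') if p.strip()]
        calls.append((name, params))
    return calls

sample = '''RUN proc-one (input a, output b).
RUN proc-name-long
                  (
                    input parameter,
                    other parameter,
                    one_more_just_because
                  ).
'''

for name, params in extract_params(sample):
    print(name, params)
```

Using finditer rather than search means every call in the data is visited, not only the first one.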

Upvotes: 1

JAB

Reputation: 21079

Use the re.S flag (also spelled re.DOTALL):

http://docs.python.org/library/re.html#re.S
http://docs.python.org/library/re.html#re.DOTALL

The documentation describes it as: "Make the '.' special character match any character at all, including a newline"

You can also set this flag inside the regex itself by prepending '(?s)' to the pattern string.

Also, don't forget that you might want to use the raw string syntax with pattern strings, so that Python leaves the backslashes alone and passes them to the regex engine unchanged. I don't think you'll have any problems with your example as given, but it's always a good idea to use raw strings as a self-reminder.

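So something like pattern = r'(?s)^([R][uU][nN])(.*)\((.*)\)\.' should do the trick. If the data holds more than one call, use '(?ms)' so that '^' matches at the start of every line, and the non-greedy (.*?) so that each match stops at the first ').'.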
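A small sketch comparing the behaviour with and without the flag, and showing that the inline '(?s)' prefix and the re.DOTALL argument give the same result. The sample string is made up for illustration, and the final dot is escaped so it matches only a literal period.

```python
import re

text = 'RUN proc-name (input a,\n               output b).'

# Without DOTALL, '.' stops at the newline, so the multi-line call is missed.
plain = re.search(r'^(R[uU][nN])(.*)\((.*)\)\.', text)

# The inline flag (?s) and the re.DOTALL argument are two spellings of the same option.
inline = re.search(r'(?s)^(R[uU][nN])(.*)\((.*)\)\.', text)
flagged = re.search(r'^(R[uU][nN])(.*)\((.*)\)\.', text, re.DOTALL)

print(plain)                                 # None
print(inline.group(3))                       # the parameter text, newline included
print(inline.groups() == flagged.groups())   # True
```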

Upvotes: 3
