GauravP

Reputation: 61

How to match across newlines in a regular expression in Python?

This is my code. I want to ignore everything between two ~ characters, even if it contains newlines or other whitespace, so that comments are skipped.

for letter in code :

    tok += letter                                   #adding each character to the token.

    if not is_str and (tok == " " or tok == "\n"):
        # Ignore whitespace and newlines when not inside a string.
        tok = ""                                    # Reset the token.

    #Always always always remember. It's not lexer's job to generate errors
    #It's the work of parser. One thing should only do one thing.

    elif re.search(r'Enter', tok):
        tokens.append("ENTER")
        tok = ""

    elif re.search(r'~(.*?|\n*?)~', tok):
        # Ignore comments written within ~this~
        tok = ""

Upvotes: 0

Views: 132

Answers (2)

Moses Koledoye

Reputation: 78554

You can use the re.DOTALL flag. From the documentation:

Make the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline.

pattern = re.compile(r'~(.*?)~', re.DOTALL)

Trial:

>>> import re
>>> s = '''~dksdjs
... sdjs~'''
>>> pattern = re.compile(r'~(.*?)~', re.DOTALL)
>>> pattern.search(s)
<_sre.SRE_Match object; span=(0, 13), match='~dksdjs\nsdjs~'>
#                                                    ^
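
As a rough sketch of how this could be applied to the comment-skipping problem in the question: rather than testing the growing token one character at a time, the comments could be removed from the whole source before tokenizing. The helper name strip_comments and the sample input are made up for illustration; they are not from the question.

```python
import re

# DOTALL lets '.' match newlines, so a ~...~ comment may span several lines.
# The non-greedy '.*?' stops at the first closing '~'.
COMMENT = re.compile(r'~.*?~', re.DOTALL)

def strip_comments(code):
    """Return code with every ~...~ comment removed (hypothetical helper)."""
    return COMMENT.sub('', code)

source = "Enter ~a comment\nover two lines~ Enter"
print(repr(strip_comments(source)))
```

Only the comment itself is removed here; the whitespace around it is left for the tokenizer to skip as before.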

Upvotes: 2

xystum

Reputation: 1009

If no ~ can appear inside a ~...~ comment, you can use:

r'~[^~]*~'

The class [^~] matches any character except ~, including newlines, so no flag is needed.
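
A quick check of this pattern on a made-up multi-line string (the sample text is only an illustration):

```python
import re

# No flags: a negated character class already matches '\n'.
pattern = re.compile(r'~[^~]*~')

s = "before ~line one\nline two~ after"
match = pattern.search(s)
print(repr(match.group()))
```

Since the class can never consume a ~, the match always ends at the first closing ~ without needing a non-greedy quantifier.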

Upvotes: 2
