How to use REGEX with multiline

Question

The following expression works well for extracting the portion of the data string that starts with the word Block followed by an opening bracket { and ends with the closing bracket }:

import re

data = """
Somewhere over the rainbow
Way up high 
Block {
 line 1
 line 2
 line 3
}
And the dreams that you dreamed of
Once in a lullaby
"""
regex = re.compile(r"(Block\ {\n\ [^\{\}]*\n}\n)", re.MULTILINE)
result = regex.findall(data)
print(result)

which returns:

['Block {\n line 1\n line 2\n line 3\n}\n']

But if there is another curly bracket inside the Block portion of the string, the expression breaks and returns an empty list:

data ="""
Somewhere over the rainbow
Way up high 
Block {
 line 1
 line 2
 {{}
 line 3
}
And the dreams that you dreamed of
Once in a lullaby
Block {
 line 4
 line 5
 {{
 }
 line 6
}
Somewhere over the rainbow
Blue birds fly
And the dreams that you dreamed of
Dreams really do come true ooh oh
"""

How can this regex be modified so that it ignores the brackets inside the Blocks, while still returning each block as a separate entry in the result list (so that each Block can be accessed separately)?

cchamberlain · Accepted Answer

Wouldn't this work?

regex = re.compile(r"(Block\ {\n\ [^\}]*\n}\n)", re.MULTILINE)

In the version you've posted, the character class [^\{\}]* excludes both brace characters, so the match is abandoned as soon as it comes across a second opening brace, even though you only want it to stop at the first closing brace. If you want properly nested opening / closing braces, that's another story.
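One caveat with the sample data from the question: an inner } (on the {{} line) appears before the block's own closing brace, so a character class that excludes } can still stop short. A possible alternative, sketched below, is to anchor the closing brace to the start of a line. This is only a sketch, and it assumes that a block always ends with an unindented } on its own line while any braces inside the block are indented, as in the question's data. The names BLOCK_RE and find_blocks are made up for illustration.

```python
import re

data = """
Somewhere over the rainbow
Way up high
Block {
 line 1
 line 2
 {{}
 line 3
}
And the dreams that you dreamed of
Once in a lullaby
Block {
 line 4
 line 5
 {{
 }
 line 6
}
Somewhere over the rainbow
"""

# Assumption: the block's closing brace is the first "}" at the very start
# of a line; braces inside the block are indented, so "^\}" skips them.
# re.MULTILINE lets "^" match at every line start.
# re.DOTALL lets "." match newlines, so ".*?" can span several lines.
# ".*?" is non-greedy, so each match ends at the nearest line-start "}".
BLOCK_RE = re.compile(r"^Block \{\n.*?^\}\n", re.MULTILINE | re.DOTALL)


def find_blocks(text):
    """Return every 'Block { ... }' section as a separate string."""
    return BLOCK_RE.findall(text)


for block in find_blocks(data):
    print(repr(block))
```

With this data, find_blocks(data) returns a list of two strings, one per Block, so each can be accessed by index. If the inner braces are not reliably indented, or the braces must be matched as balanced pairs, a regex of this kind is not enough and counting brace depth in a small parser is likely the safer route.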
