Matching a string that's arbitrarily split over multiple lines

Question

Is there a way, using regexes, to match a string that is arbitrarily split over multiple lines? Say we have the following format in a file:

msgid "This is "
"an example string"
msgstr "..."

msgid "This is an example string"
msgstr "..."

msgid ""
"This is an " 
"example" 
" string"
msgstr "..."

msgid "This is " 
"an unmatching string" 
msgstr "..."

So we would like a pattern that matches all the example strings, i.e. one that matches the string regardless of how it's split across lines. Note that we are after a specific string, as shown in the sample, not just any string. In this case we would like to match the string "This is an example string".

Of course we can easily concatenate the strings and then apply the match, but it got me wondering whether this is possible with a regex alone. I'm talking about Python regexes, but a general answer is OK.

pwuertz · Accepted Answer

Do you want to match a series of words? If so, you could look for the words with only whitespace (\s) in between, since \s matches newlines and spaces alike.

import re

search_for = "This is an example string"
# Join the words with \s+ so any run of whitespace (spaces or newlines) may separate them.
search_for_re = r"\b" + r"\s+".join(search_for.split()) + r"\b"
pattern = re.compile(search_for_re)

def match(s):
    # pattern.match() only matches at the start of s
    return pattern.match(s) is not None

s = "This is an example string"
print(match(s), ":", repr(s))

s = "This is an \n example string"
print(match(s), ":", repr(s))

s = "This is \n an unmatching string"
print(match(s), ":", repr(s))

Prints:

True : 'This is an example string'
True : 'This is an \n example string'
False : 'This is \n an unmatching string'
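
The pattern above treats the text as words separated by whitespace, so it doesn't account for the quote characters that close one line and open the next in the msgid format from the question. A rough sketch of one way to handle that, assuming the string can be broken between any two characters by a closing quote, a newline and an opening quote: allow that optional break between every pair of characters. The names build_split_pattern and LINE_BREAK are made up for this illustration, and the sample is the data from the question.

```python
import re

# Optional break inside a quoted string: closing quote, optional trailing
# blanks, a newline, optional leading blanks, then the next opening quote.
LINE_BREAK = r'(?:"[ \t]*\n[ \t]*")?'

def build_split_pattern(text):
    """Compile a regex matching a msgid whose full text equals `text`."""
    # Allow the optional break between every pair of characters.
    body = LINE_BREAK.join(re.escape(ch) for ch in text)
    # The leading LINE_BREAK covers the 'msgid ""' form; the negative
    # lookahead rejects entries that continue on a further quoted line.
    return re.compile(
        r'msgid[ \t]+"' + LINE_BREAK + body + r'"[ \t]*$(?!\n[ \t]*")',
        re.MULTILINE,
    )

sample = "\n".join([
    'msgid "This is "',
    '"an example string"',
    'msgstr "..."',
    '',
    'msgid "This is an example string"',
    'msgstr "..."',
    '',
    'msgid ""',
    '"This is an " ',
    '"example" ',
    '" string"',
    'msgstr "..."',
    '',
    'msgid "This is " ',
    '"an unmatching string" ',
    'msgstr "..."',
    '',
])

pattern = build_split_pattern("This is an example string")
hits = pattern.findall(sample)
print(len(hits))  # number of msgid entries that match
```

On the sample this should find the first three entries and skip the unmatching one. The generated pattern grows with the length of the search string, so for real files, concatenating the pieces first (as the question suggests) is probably still the simpler route.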
