Reputation: 2316
If you find the title a bit cryptic, here's what I meant: I'm looking for every pattern that starts with the hash sign (#) and then matches everything after it until another hash or another defined entry is found. Neither that closing hash nor the other entry should be part of the match.
Given this example:
#one_liner = some cr4zy && weird stuff h3r3 $% ()
#multi_liner =some other s7uff,
but put in (other) line
#one_liner_again = again, some stuff here...
LB: this line shouldn't be taken into consideration!
#multi_liner_again=You guessed:
going to another line!
<EOF>
I'd like to end up with four matches, for example as a set of tuples like this:
("one_liner", "some cr4zy && weird stuff h3r3 $% ()")
("multi_liner", "some other s7uff, but put in (other) line")
("one_liner_again", "again, some stuff here...")
("multi_liner_again", "You guessed: going to another line!")
I tried this pattern, but it doesn't give me what I'm looking for:
#\w.+\s*=(\s*|\S*.+)\w.+\n*\s*.+(?=#)
Upvotes: 0
Views: 66
Reputation: 862
This might get you started. You'll probably want to strip newlines and unwanted whitespace from the found values, though.
import re
data = """
#one_liner = some cr4zy && weird stuff h3r3 $% ()
#multi_liner =some other s7uff,
but put in (other) line
#one_liner_again = again, some stuff here...
LB: this line shouldn't be taken into consideration!
#multi_liner_again=You guessed:
going to another line!
"""
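# Key: everything between '#' and '='. Value: non-'#' characters, stopping
# one character before any "LB:" (the negative lookahead) or at the next '#'.
for m in re.findall(r"#([^#]+)\s*=\s*((?:[^#](?!LB:))*)", data, re.MULTILINE|re.DOTALL):
    print(m)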
prints:
('one_liner ', 'some cr4zy && weird stuff h3r3 $% ()\n')
('multi_liner ', 'some other s7uff,\n but put in (other) line\n')
('one_liner_again ', 'again, some stuff here...\n ')
('multi_liner_again', 'You guessed: \ngoing to another line!\n')
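To get from there to the tuples asked for in the question, the leftover newlines and spaces still have to be cleaned up. Here is a rough sketch of one way to do it, not the only one: it swaps in a slightly different pattern (a lazy match with a lookahead) and collapses whitespace afterwards. The names `ENTRY` and `parse_entries` are made up for this example, and it assumes that keys consist only of word characters and that the "other defined entry" is a line starting with `LB:`.

```python
import re

data = """\
#one_liner = some cr4zy && weird stuff h3r3 $% ()
#multi_liner =some other s7uff,
but put in (other) line
#one_liner_again = again, some stuff here...
LB: this line shouldn't be taken into consideration!
#multi_liner_again=You guessed:
going to another line!
"""

# Key: word characters right after '#'.
# Value: lazily match anything up to the next '#', a line that starts
# with "LB:" (assumed terminator), or the end of the input.
ENTRY = re.compile(
    r"#(\w+)\s*=\s*(.*?)(?=#|^[ \t]*LB:|\Z)",
    re.MULTILINE | re.DOTALL,
)

def parse_entries(text):
    """Return (key, value) tuples with whitespace runs collapsed to one space."""
    return [(key, " ".join(value.split())) for key, value in ENTRY.findall(text)]

for entry in parse_entries(data):
    print(entry)
```

Because the value is cut off by a lookahead, a `#` inside a value would end that value early, just like in the pattern above. If values can contain `#`, anchoring the key to the start of a line (`^#`) would probably be the safer choice.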
Upvotes: 2