Reputation: 8350
My string looks like this:
[abc]
line_one xxxxxxxxxxxxxx
line_two xxxxxxxxxxxxxx
[pqr]
line_four xxxxxxxxxxxxxx
line_five xxxxxxxxxxxxxx
[xyz]
line_six xxxxxxxxxxxxxx
line_seven xxxxxxxxxxxxxx
I am trying to fetch these lines section-wise. I tried the regular expressions below, but with no luck.
result = re.compile(r'(\[.+\])')
details = result.findall(string)
With this I get only the section names. Then I tried:
result = re.compile(r'(\[.+\]((\n)(.+))+)')
Any suggestions?
Upvotes: 1
Views: 211
Reputation: 89639
With split:
re.split(r'\n*(?=\[)', s)
or
re.split(r'(?m)\n*^(?=\[)', s)
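A minimal sketch of the split approach, assuming the question's text is held in a variable named `s` (the sample below is reconstructed from the question). On Python 3.7 and later, `re.split` also splits on empty matches, so a pattern built on `\n*` may yield empty strings; the sketch filters them out rather than relying on version-specific behaviour.

```python
import re

# Sample input reconstructed from the question (assumed to live in `s`).
s = (
    "[abc]\n"
    "line_one xxxxxxxxxxxxxx\n"
    "line_two xxxxxxxxxxxxxx\n"
    "[pqr]\n"
    "line_four xxxxxxxxxxxxxx\n"
    "line_five xxxxxxxxxxxxxx\n"
    "[xyz]\n"
    "line_six xxxxxxxxxxxxxx\n"
    "line_seven xxxxxxxxxxxxxx"
)

# Split in front of every "[" that starts a line, swallowing the newlines
# before it. Empty pieces (possible on Python 3.7+) are discarded.
sections = [part for part in re.split(r'(?m)\n*^(?=\[)', s) if part]

for section in sections:
    print(repr(section))
```

Each element of `sections` is one header line followed by its body lines, with no trailing newline.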
Upvotes: 1
Reputation: 67998
(\[[^\]]*\][^\[]+)(?:\s|$)
Try this. See the demo. This will give you the lines section-wise.
http://regex101.com/r/mP1wO4/1
import re
# Capture each [section] header plus everything up to the next "["
p = re.compile(r'(\[[^\]]*\][^\[]+)(?:\s|$)')
test_str = u"[abc]\nline_one xxxxxxxxxxxxxx\nline_two xxxxxxxxxxxxxx\n[pqr]\nline_four xxxxxxxxxxxxxx\nline_five xxxxxxxxxxxxxx\n[xyz]\nline_six xxxxxxxxxxxxxx\nline_seven xxxxxxxxxxxxxx"
re.findall(p, test_str)
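As a possible follow-up, the matched blocks can be turned into a mapping from section name to body lines. This is only a sketch: the names `pattern` and `by_section` are illustrative, and the pattern is written with a plain `r''` prefix so that it runs on Python 3. Note that `[^\[]+` stops at any `[`, so a body line containing `[` would cut its section short.

```python
import re

test_str = (
    "[abc]\nline_one xxxxxxxxxxxxxx\nline_two xxxxxxxxxxxxxx\n"
    "[pqr]\nline_four xxxxxxxxxxxxxx\nline_five xxxxxxxxxxxxxx\n"
    "[xyz]\nline_six xxxxxxxxxxxxxx\nline_seven xxxxxxxxxxxxxx"
)

# Same regex as in the answer, written as an ordinary raw string.
pattern = re.compile(r'(\[[^\]]*\][^\[]+)(?:\s|$)')

by_section = {}
for block in pattern.findall(test_str):
    # First line is the "[name]" header; the rest are the section's lines.
    header, *body = block.splitlines()
    by_section[header.strip('[]')] = body

print(by_section['pqr'])
# ['line_four xxxxxxxxxxxxxx', 'line_five xxxxxxxxxxxxxx']
```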
Upvotes: 1
Reputation: 174874
Use the re.findall function. You need to include \n inside the positive lookahead so that the match does not capture the newline character just before the next [] block.
>>> m = re.findall(r'(?s)(?:^|\n)(\[[^\]]*\].*?)(?=\n\[[^\]]*\]|$)', s)
>>> m
['[abc]\nline_one xxxxxxxxxxxxxx\nline_two xxxxxxxxxxxxxx', '[pqr]\nline_four xxxxxxxxxxxxxx\nline_five xxxxxxxxxxxxxx', '[xyz]\nline_six xxxxxxxxxxxxxx\nline_seven xxxxxxxxxxxxxx']
>>> for i in m:
...     print(i)
...
[abc]
line_one xxxxxxxxxxxxxx
line_two xxxxxxxxxxxxxx
[pqr]
line_four xxxxxxxxxxxxxx
line_five xxxxxxxxxxxxxx
[xyz]
line_six xxxxxxxxxxxxxx
line_seven xxxxxxxxxxxxxx
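To illustrate why the \n belongs inside the lookahead, here is a small comparison sketch. The string `s` is reconstructed from the question, and the second pattern (without \n in the lookahead) is a hypothetical variant shown only for contrast. Without the \n, the lazy `.*?` runs right up to the next `[`, so the first match keeps its trailing newline, which in turn can leave the following `(?:^|\n)` with no newline to consume.

```python
import re

s = (
    "[abc]\nline_one xxxxxxxxxxxxxx\nline_two xxxxxxxxxxxxxx\n"
    "[pqr]\nline_four xxxxxxxxxxxxxx\nline_five xxxxxxxxxxxxxx\n"
    "[xyz]\nline_six xxxxxxxxxxxxxx\nline_seven xxxxxxxxxxxxxx"
)

# Pattern from the answer: the lookahead stops just before "\n[".
with_newline = re.findall(
    r'(?s)(?:^|\n)(\[[^\]]*\].*?)(?=\n\[[^\]]*\]|$)', s)

# Contrast variant (not from the answer): lookahead without the "\n".
without_newline = re.findall(
    r'(?s)(?:^|\n)(\[[^\]]*\].*?)(?=\[[^\]]*\]|$)', s)

print(with_newline[0].endswith("\n"))     # False
print(without_newline[0].endswith("\n"))  # True
```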
Upvotes: 1