Reputation: 551
I have a file with the following structure:
******
Block 1
text
text
...
End
******
Block 2
text
text
...
End
******
Block 3
text
text
...
End
******
and so on. I want to open the file, read it line by line, and save the contents of the first block in a string. This is what I have so far:
Block = ''
with open(File) as file:
    for line in file:
        if re.match('\.Block.*', line):
            Block += line
        if 'str' in line:
            break
print (Block)
However, when I print Block I am getting:
Block 1
Block 2
...
How can I use my regex to copy the lines from Block 1 to End? Thank you
Upvotes: 0
Views: 167
Reputation: 2407
import re

with open(File) as ff:
    txt = ff.read()  # read the whole file into one string
re.findall(r"(?ms)^\s*Block\s*\d+.*?^\s*End\s*$", txt)
Out:
['Block 1\ntext\ntext\n...\nEnd ',
'Block 2\ntext\ntext\n...\nEnd ',
'Block 3\ntext\ntext\n...\nEnd ']
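Or change '\d+' to '1\b' to get only the first block; the '\b' keeps the pattern from also matching Block 10, Block 11, and so on.
(?ms) sets two flags:
m: multiline mode, so ^ and $ match at the start and end of each line;
s: '.' matches newlines too.
?: makes '.*?' non-greedy, so each match stops at the nearest End.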
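If only the first block is needed, a minimal sketch is to keep the pattern as it is and call search instead of findall, since search stops at the first match. The sample string here is made up to mirror the layout in the question; it is not the asker's real file.

```python
import re

# Hypothetical sample text mirroring the layout in the question
txt = (
    "******\nBlock 1\ntext a\ntext b\nEnd\n"
    "******\nBlock 2\ntext c\nEnd\n******\n"
)

pattern = re.compile(r"(?ms)^\s*Block\s*\d+.*?^\s*End\s*$")

# search() returns only the first match, or None if nothing matches
m = pattern.search(txt)
first_block = m.group(0) if m else ""
print(first_block)
```

This avoids building a list of every block when the file is large and only one block is wanted.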
Upvotes: 0
Reputation: 1
You're only appending lines that match the regex '\.Block.*', so the text lines and the End line between the headers are never added. To capture a whole block, you also need to keep track of whether you are currently inside one.
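import re

Block = ''
Match = False  # True while we are inside the first block
with open(File) as file:
    for line in file:
        # Start collecting at the first header line
        # ('\.' would require a literal dot before "Block")
        if re.match(r'Block', line):
            Match = True
        if Match:
            Block += line
            # Stop once the closing End line has been added
            if re.match(r'End\s*$', line):
                break
print(Block)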
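The same flag idea can be sketched as a small function that accepts any iterable of lines, which makes it easy to try without a file. The name first_block and the sample list are hypothetical, and the sketch uses str.startswith and str.strip rather than a regex.

```python
def first_block(lines):
    """Return the first 'Block ...' through 'End' run of lines as one string."""
    collected = []
    inside = False  # becomes True once the first header line is seen
    for line in lines:
        if not inside and line.startswith("Block"):
            inside = True
        if inside:
            collected.append(line)
            # strip() tolerates a trailing space or newline after "End"
            if line.strip() == "End":
                break
    return "".join(collected)


# Hypothetical sample standing in for the lines of the file
sample = ["******\n", "Block 1\n", "text\n", "End\n",
          "******\n", "Block 2\n", "text\n", "End\n"]
result = first_block(sample)
print(result)
```

Because an open file is itself an iterable of lines, it can be passed to the function directly, and reading stops as soon as the first End is reached.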
Upvotes: 0
Reputation: 71471
You can use itertools.groupby:
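import itertools, re

# Read the lines inside a with block so the file is closed afterwards
with open('filename.txt') as f:
    lines = [i.strip('\n') for i in f]
# Group consecutive lines by whether they are a row of asterisks; keep the groups that are not
first_result, *_ = [list(b) for a, b in itertools.groupby(lines, key=lambda x: bool(re.findall(r'^\*+$', x))) if not a]
print(first_result)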
Output:
['Block 1', 'text', 'text', '...', 'End ']
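Since groupby is lazy, a variation on this idea is to take just the first non-separator group with next() instead of building every group. This is only a sketch: first_group and the sample list are hypothetical, and the key uses a set comparison in place of the regex.

```python
import itertools


def first_group(lines):
    """Return the first run of lines that are not rows of asterisks."""
    # The key is True for separator lines made up only of '*' characters
    groups = itertools.groupby(lines, key=lambda s: set(s) == {"*"})
    # Each group is turned into a list before groupby advances;
    # the default [] covers input with no non-separator lines
    return next((list(g) for is_sep, g in groups if not is_sep), [])


# Hypothetical sample standing in for the stripped lines of the file
sample = ["******", "Block 1", "text", "End", "******",
          "Block 2", "text", "End", "******"]
first = first_group(sample)
print(first)
```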
Upvotes: 1