Reputation: 51
When a sentence in my_file begins with a word followed by a digit, such as "City1", and there is another sentence in my_file beginning with "City2", the following code only returns the first sentence:
description = re.findall("\n"+i+"[\s\,\d\(].*\.\n", my_file)  # i equals 'City'
if description:
    for d in description:
        d = d.replace('\n', ' ')
        bufferlist.append(d)
    bufferlist[:] = unique(bufferlist)  # unique is a function removing duplicates from a list while keeping its order
    my_string = ' '.join(bufferlist)
    del bufferlist[:]
else:
    my_string = '0'
Why can't I get both the first and the second sentence in my_string?
EDIT
The problem, or a part of it, was del bufferlist[:]. Clearing the list inside the loop prevented matches from accumulating across iterations. The bufferlist has to be cleared after the loop instead.
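A minimal sketch of that fix, under stated assumptions: the question does not show the outer loop, so the `chunks` list it iterates over here is hypothetical, and `unique` is a stand-in for the helper described in the question.

```python
import re


def unique(items):
    # Stand-in for the question's helper: drop duplicates, keep first-seen order.
    return list(dict.fromkeys(items))


# Hypothetical input: the question does not show what the outer loop iterates over.
chunks = ["Intro.\nCity1 is large.\n", "More.\nCity2 is small.\n"]
i = "City"

bufferlist = []
for my_file in chunks:
    description = re.findall("\n" + i + r"[\s\,\d\(].*\.\n", my_file)
    for d in description:
        bufferlist.append(d.replace("\n", " "))
    # No `del bufferlist[:]` here: clearing inside the loop would discard
    # the matches collected in earlier iterations.

bufferlist[:] = unique(bufferlist)
my_string = " ".join(bufferlist) if bufferlist else "0"
del bufferlist[:]  # clear once, after the loop

print(my_string)
```

With the clearing moved out of the loop, my_string contains the sentences from every iteration rather than only those from one.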
Upvotes: 1
Views: 1055
Reputation: 67968
(?:^|(?<=\n))City[\s\,\d\(].*\.(?=\n|$)
Try this. Your pattern consumes the trailing \n, so it may no longer be available for the next match, which also needs a leading \n.
See demo.
https://regex101.com/r/VIXyar/1
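To see the difference locally, here is a small sketch comparing both patterns. The sample text in my_file is made up for illustration, since the original file contents are not shown in the question.

```python
import re

# Hypothetical sample text: two consecutive lines starting with "City" + digit.
my_file = "Intro line.\nCity1 is large.\nCity2 is small.\nOther text.\n"
i = "City"

# Pattern from the question: the newline on each side is part of the match,
# so the "\n" after the first sentence is used up and cannot start the second.
consuming = re.findall("\n" + i + r"[\s\,\d\(].*\.\n", my_file)

# Pattern from this answer: the newlines are only asserted with lookarounds,
# so they stay available for the next match.
lookaround = re.findall(
    r"(?:^|(?<=\n))" + i + r"[\s\,\d\(].*\.(?=\n|$)", my_file
)

print(consuming)   # ['\nCity1 is large.\n']
print(lookaround)  # ['City1 is large.', 'City2 is small.']
```

re.findall returns non-overlapping matches, which is why two adjacent sentences cannot both claim the same newline. The lookaround version also returns the sentences without surrounding newlines, so the later replace('\n', ' ') step has nothing left to do for these matches.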
Upvotes: 1