Reputation: 4100
I have a starting phrase, say fruits, and some ending phrases such as apple, banana, and pineapple.
I have some documents, each stored in a variable named text:
Fruits
They are good for health....
should eat Apple
Fruits
eat regularly banana
Fruits you need
to eat Apple
Fruits are good
Daily we should have pineapple
In general, fruits have various minerals.
Most of them are very tasty
My Regex and code:
import re

p = r'(\bFruits\b\s*\w*\s*\n*.*?(\bApples?\b|\bbananas?\b|\bpineapples?\b))'
sep = ";;"
lst = re.findall(p, text, re.I|re.M|re.DOTALL)
val = sep.join(str(v) for v in lst )
The regex above works well on texts 1 and 2, but only partially on text 3.
Problem:
When the starting phrase fruits is found and none of the ending phrases follows it, then (and only then) the match should run to the end of the document.
Expected output from text 3:
Fruits are good Daily we should have pineapple ;; fruits have various minerals.
Most of them are very tasty
P.S.: I also tried $, but that did not work either.
Upvotes: 0
Views: 43
Reputation: 3097
Include \Z, which matches only at the very end of the string, as one more alternative among the ending phrases:
import re

text = '''Fruits are good

Daily we should have pineapple

In general, Fruits have various minerals.

Most of them are very tasty
'''
p = r'(\bFruits\b\s*\w*\s*\n*.*?(\bApples?\b|\bbananas?\b|\bpineapples?\b|\Z))'
sep = ";;"
lst = re.findall(p, text, re.I|re.M|re.DOTALL)
val = sep.join(str(v) for v in lst )
print(val)
The output is as follows:
('Fruits are good\n\nDaily we should have pineapple', 'pineapple');;('Fruits have various minerals.\n\nMost of them are very tasty\n', '')
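The output above contains tuples because the pattern has two capturing groups, so re.findall returns one tuple per match. If plain strings are wanted, as in the expected output from the question, a minimal sketch could look like the following. It is not the original code: the helper name extract_spans is made up for illustration, the pattern is a simplified variant (the \s*\w*\s*\n* prefix is dropped on the assumption that .*? under re.DOTALL already covers it), and collapsing whitespace into single spaces is an assumption about the desired format.

```python
import re

# Simplified variant of the pattern above (an assumption, not the original):
# the ending phrases sit in a non-capturing group, so each match is one string.
# \Z is the fallback that lets a match run to the very end of the text.
PATTERN = re.compile(
    r'\bFruits\b.*?(?:\bapples?\b|\bbananas?\b|\bpineapples?\b|\Z)',
    re.I | re.DOTALL,
)


def extract_spans(text, sep=";;"):
    """Join every span that starts at 'Fruits' and ends at the nearest
    ending phrase, or at the end of the text if no ending phrase follows."""
    spans = []
    for m in PATTERN.finditer(text):
        # Collapse newlines and repeated spaces (optional formatting choice).
        spans.append(" ".join(m.group(0).split()))
    return sep.join(spans)


text = '''Fruits are good
Daily we should have pineapple
In general, Fruits have various minerals.
Most of them are very tasty
'''

print(extract_spans(text))
# Fruits are good Daily we should have pineapple;;Fruits have various minerals. Most of them are very tasty
```

As for $: with re.M set, $ matches at the end of every line, so a lazy .*? followed by $ stops at the first line break instead of continuing to the end of the document. \Z does not change meaning under re.M, which is why it works as the fallback here.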
Upvotes: 1