Rahul Agarwal

Reputation: 4100

Regex starting from phrase to end of doc with condition

I have a starting phrase, say "Fruits", and several ending phrases such as "apple", "banana" and "pineapple".

I have some documents, each stored in a variable named text:

  1. Fruits

    They are good for health....

    should eat Apple

  2. Fruits

    eat regularly banana

    Fruits you need

    to eat Apple

  3. Fruits are good

    Daily we should have pineapple

    In general, fruits have various minerals.

    Most of them are very tasty

My Regex and code:

import re

# text holds one of the documents above
p = r'(\bFruits\b\s*\w*\s*\n*.*?(\bApples?\b|\bbananas?\b|\bpineapples?\b))'
sep = ";;"
lst = re.findall(p, text, re.I|re.M|re.DOTALL)
val = sep.join(str(v) for v in lst)

The regex above works well for texts 1 and 2, but only partially for text 3.

Problem:

When "Fruits" occurs and none of the ending phrases follows it, then and only then the match should run to the end of the document.

Expected Output from text 3:

Fruits are good Daily we should have pineapple ;; fruits have various minerals.
Most of them are very tasty

P.S.: I also tried $, but that did not work either.

Upvotes: 0

Views: 43

Answers (1)

Prince Francis

Reputation: 3097

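Include \Z as an extra alternative among the ending phrases. \Z matches only at the very end of the string, whereas $ with re.M also matches at the end of every line, so the lazy .*? stops too early: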

import re

text = '''Fruits are good

Daily we should have pineapple

In general, Fruits have various minerals.

Most of them are very tasty
'''

p = r'(\bFruits\b\s*\w*\s*\n*.*?(\bApples?\b|\bbananas?\b|\bpineapples?\b|\Z))'
sep = ";;"
lst = re.findall(p, text, re.I|re.M|re.DOTALL)
val = sep.join(str(v) for v in lst )
print(val)

The output is as follows:

('Fruits are good\n\nDaily we should have pineapple', 'pineapple');;('Fruits have various minerals.\n\nMost of them are very tasty\n', '')
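The tuples appear in the output because the pattern has two capturing groups, and re.findall returns one tuple per match in that case. If plain strings closer to the expected output in the question are wanted, one possible variation is sketched below. It assumes that whitespace (including newlines) inside each match may be collapsed to single spaces; the helper name extract_segments is made up for this illustration and is not part of the original code.

```python
import re

# Same pattern as above, but the outer group is dropped and the inner
# group is non-capturing (?:...), so findall returns whole matches as
# plain strings instead of tuples. re.M is omitted because the pattern
# contains neither ^ nor $.
PATTERN = re.compile(
    r'\bFruits\b\s*\w*\s*\n*.*?(?:\bApples?\b|\bbananas?\b|\bpineapples?\b|\Z)',
    re.I | re.DOTALL,
)


def extract_segments(text, sep=";;"):
    """Hypothetical helper: join all matches, with whitespace collapsed."""
    # str.split() with no arguments splits on any run of whitespace,
    # so " ".join(...) turns newlines and repeated spaces into one space.
    segments = [" ".join(m.split()) for m in PATTERN.findall(text)]
    return sep.join(segments)


text = '''Fruits are good

Daily we should have pineapple

In general, Fruits have various minerals.

Most of them are very tasty
'''

print(extract_segments(text))
# Fruits are good Daily we should have pineapple;;Fruits have various minerals. Most of them are very tasty
```

The second segment ends at \Z because no ending phrase follows the second "Fruits"; the first still stops at "pineapple".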

Upvotes: 1
