Reputation: 606
I have a test string that looks like this:
These are my food preferences mango and I also like bananas and I like grapes too.
I am trying to write a regex in Python that returns the text according to these rules:
My current expression (live: https://regex101.com/r/1CSSNc/1/ ) is:
(?P<Start>\bpreferences\b)(?:\s*(?:(?P<Name>\w*)\s*){1,7}like)*?(\s*(?P<Last>\w*\s*){1,7})
which returns
Match 1 18-64 preferences mango and I also like bananas and
Group Start 18-29 preferences
Group 3 29-64 mango and I also like bananas and
Group Last 60-64 and
I expected/wanted the output to be:
Match 1 18-64 preferences mango .. grapes too
Group Start 18-29 preferences
Group 3 29-64 mango and I also
Group 4 xx xx bananas and I
Group Last 60-64 grapes too
My implementation is missing some concepts here.
Reputation: 626903
You can use
(?P<Start>\bpreferences\b)(?P<Mid>(?:\s+\w+(?:\s+\w+){0,6}?\s+like)+)(?:\s+(?P<Last>\w+(?:\s+\w+){1,7}))?
See the regex demo.
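Mid is captured as one group and split afterwards because Python's re module keeps only the last iteration of a capturing group that sits inside a repetition. A minimal sketch of that behavior, using a made-up sample string and a simplified pattern (neither is from the question):

```python
import re

# Made-up sample, not the text from the question.
sample = "a like b like c"

# The named group sits inside a repeated non-capturing group,
# so each iteration overwrites the previous capture.
m = re.search(r"(?:(?P<Name>\w+)\s+like\s+)+", sample)

print(repr(m.group(0)))       # the whole repeated span
print(repr(m.group("Name")))  # only the last iteration survives
```

Here the match spans both repetitions, but Name holds only "b". A single re pattern therefore cannot hand back every repetition as its own group, so the whole run is captured once and split in Python.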
Details:
(?P<Start>\bpreferences\b)
- Group "Start": a whole word preferences
(?P<Mid>(?:\s+\w+(?:\s+\w+){0,6}?\s+like)+)
- Group "Mid": one or more repetitions of
    \s+
    - one or more whitespaces
    \w+(?:\s+\w+){0,6}?
    - one or more word chars and then zero to six occurrences of one or more whitespaces and then one or more word chars, as few as possible
    \s+like
    - one or more whitespaces and then the word like
(?:\s+(?P<Last>\w+(?:\s+\w+){1,7}))?
- an optional occurrence of
    \s+
    - one or more whitespaces
    (?P<Last>\w+(?:\s+\w+){1,7})
    - Group "Last": one or more word chars and then one to seven occurrences of one or more whitespaces and one or more word chars
See the Python demo:
import re
text = "These are my food preferences mango and I also like bananas and I like grapes too."
pattern = r"(?P<Start>\bpreferences\b)(?P<Mid>(?:\s+\w+(?:\s+\w+){0,6}?\s+like)+)(?:\s+(?P<Last>\w+(?:\s+\w+){1,7}))?"
match = re.search(pattern, text)
if match:
    print(match.group("Start"))
    print( re.split(r"\s*\blike\b\s*", match.group("Mid").strip()) )
    print(match.group("Last"))
Output:
preferences
['mango and I also', 'bananas and I', '']
grapes too
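The empty string at the end of the list appears because Mid always ends with like, so re.split finds nothing after the last separator. If that is unwanted, it can be filtered out. A minimal sketch, where extract_preferences is a hypothetical helper name and not part of any library:

```python
import re

PATTERN = re.compile(
    r"(?P<Start>\bpreferences\b)"
    r"(?P<Mid>(?:\s+\w+(?:\s+\w+){0,6}?\s+like)+)"
    r"(?:\s+(?P<Last>\w+(?:\s+\w+){1,7}))?"
)

def extract_preferences(text):
    """Hypothetical helper: return (start, middle_chunks, last) or None."""
    match = PATTERN.search(text)
    if match is None:
        return None
    # Split Mid on the word "like" and drop the empty chunk
    # left behind by the trailing "like".
    parts = re.split(r"\s*\blike\b\s*", match.group("Mid").strip())
    chunks = [part for part in parts if part]
    # Last is None when the optional tail did not match.
    return match.group("Start"), chunks, match.group("Last")

text = "These are my food preferences mango and I also like bananas and I like grapes too."
result = extract_preferences(text)
print(result)
```

For the test string this gives the start word, the two middle chunks without the empty entry, and the last chunk.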