Reputation: 13
Assume that we have:
"ABC_" or "CDE_" may appear as a prefix or be absent. In addition, "_DEF" may appear as a postfix or be absent.
In this case, can I extract ANY_STRING (any sequence of characters) between the prefix and the postfix using a single regular expression?
For example, if the input is "ABC_CDE_I like an apple_DEF", the output must be "I like an apple".
I tried the following code, but it does not output what I expected.
re.compile(r"(?:ABC_|CDE_)*(\S+)(?:_DEF)?")
or
re.compile(r"(?:ABC_|CDE_)*(\S+)(?:_DEF)*")
Thanks a lot in advance for your advice.
Upvotes: 1
Views: 146
Reputation: 627600
You may use
(?:ABC_|CDE_|^)+(\S*?)(?:_DEF|$)
See the regex demo
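A minimal sketch of how this pattern might be used from Python. The helper name extract and the sample inputs are made up for illustration. Note that \S does not match spaces, so the pattern as written finds no match in the question's example "ABC_CDE_I like an apple_DEF". SPACED_RE below is an assumed variant, not part of the pattern above: it swaps \S for . so the captured text may contain spaces.

```python
import re

# Pattern from this answer; \S cannot match whitespace.
ANSWER_RE = re.compile(r"(?:ABC_|CDE_|^)+(\S*?)(?:_DEF|$)")

# Assumed variant (not from the answer): "." instead of "\S", so the
# captured text may contain spaces (but still no newlines).
SPACED_RE = re.compile(r"(?:ABC_|CDE_|^)+(.*?)(?:_DEF|$)")


def extract(text, pattern=ANSWER_RE):
    """Return capturing group 1 of the first match, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


print(extract("ABC_CDE_apple_DEF"))                       # apple
print(extract("apple"))                                   # apple
print(extract("ABC_CDE_I like an apple_DEF"))             # None
print(extract("ABC_CDE_I like an apple_DEF", SPACED_RE))  # I like an apple
```

re.search is used rather than re.match or re.fullmatch so that the ^ and $ alternatives inside the pattern decide where the match may start and end.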
Details
- (?: - start of a non-capturing group that matches any of the subpatterns separated with the alternation operator |:
  - ABC_ - a literal substring ABC_
  - | - or
  - CDE_ - a literal substring CDE_
  - | - or
  - ^ - start of string
- )+ - one or more consecutive occurrences, as many as possible (+ is a greedy quantifier)
- (\S*?) - Capturing group 1: zero or more chars other than whitespace, but as few as possible due to the *? lazy quantifier
- (?:_DEF|$) - either _DEF or (|) end of string ($).
Upvotes: 2