Reputation: 194
I am trying to form a regular expression that would capture <expression1> if it is in the string, and otherwise capture <expression2>.
I tried something along the lines of (IF)(?(1)THEN|ELSE), meaning the capture would be IFTHEN (in case IF is found) or ELSE (in case IF is not found).
For example:
(apple1\d)(?(1)|apple2\d)
case1:
for the string: pear33 apple14 apple24 orange22 orange44
Result would be: apple14
case2:
In contrast for the string: pear33 apple24 orange22 orange44
The result would be: apple24
(since there is no apple1, it would capture apple2\d)
My regex works well for case1, where it returns apple14; however, the ELSE doesn't work. I expect it to return apple24 for case2.
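The behaviour described above can be reproduced with a short sketch, assuming Python's re module (the variable names are only illustrative). Because group 1 is not optional, the pattern cannot match at all unless apple1\d is present, so the ELSE branch is never reached:

```python
import re

# Pattern from the question: group 1 is mandatory, so the conditional
# (?(1)...|...) is only evaluated after group 1 has already matched.
pattern = re.compile(r'(apple1\d)(?(1)|apple2\d)')

case1 = 'pear33 apple14 apple24 orange22 orange44'
case2 = 'pear33 apple24 orange22 orange44'

match1 = pattern.search(case1)
match2 = pattern.search(case2)

print(match1.group() if match1 else None)  # apple14
print(match2.group() if match2 else None)  # None
```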
Upvotes: 3
Views: 1207
Reputation: 482
To start off, I'm not sure why you'd need an if-else statement for this (See version 2 of my answer), but I'll try to provide a few solutions.
So, for me, @Barmer's solution (If-Then-Else regex statement) gave me error: bad character in group name, although I'm sure with proper tweaking that may be the optimal solution.
Until he gets back, however, you can try these (although search.group() and search.groups() do annoy me a bit regarding their handling of capture groups/lack thereof).
VERSION 1: Ultra specific version, based on the solutions suggested above. My solution here is not desirable in my opinion.
>>> import re
>>> string1 = 'pear33 apple14 apple24 orange22 orange44'
>>> string2 = 'pear33 apple24 apple14 orange22 orange44'
>>> re.findall(r'(?<!apple[12]\d)[\s]+(apple1\d|apple2\d)', string1)
['apple14']
>>> re.findall(r'(?<!apple[12]\d)[\s]+(apple1\d|apple2\d)', string2)
['apple24']
>>> re.search(r'(?<!apple[12]\d)[\s]+(apple1\d|apple2\d)', string1).group()
' apple14'
>>> re.search(r'(?<!apple[12]\d)[\s]+(apple1\d|apple2\d)', string2).group()
' apple24'
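If the leading space in the search() results is unwanted, one option is to ask for the capture group rather than the whole match. A minimal sketch reusing the Version 1 pattern (the variable names are illustrative):

```python
import re

# Same pattern as Version 1; group 1 holds only the apple token,
# while group() / group(0) also includes the preceding whitespace.
pattern = re.compile(r'(?<!apple[12]\d)[\s]+(apple1\d|apple2\d)')

string1 = 'pear33 apple14 apple24 orange22 orange44'
string2 = 'pear33 apple24 apple14 orange22 orange44'

first1 = pattern.search(string1).group(1)
first2 = pattern.search(string2).group(1)

print(first1)  # apple14
print(first2)  # apple24
```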
VERSIONS 2 AND 3: Way better and more scalable versions in my opinion. I'm partial to version 2. TBH, though, this approach can tie up memory on long strings, but for short strings it will work fine.
>>> string1 = 'pear33 apple14 apple24 orange22 orange44'
>>> string2 = 'pear33 apple24 apple14 orange22 orange44'
>>> re.findall(r'[\S\s]*?(apple[\d]+)[\S\s]*', string1)
['apple14']
>>> re.findall(r'[\S\s]*?(apple[\d]+)[\S\s]*', string2)
['apple24']
>>> re.findall(r'(?<!apple\d\d)[\S\s]+?(apple[\d]+)[\S\s]*', string1)
['apple14']
>>> re.findall(r'(?<!apple\d\d)[\S\s]+?(apple[\d]+)[\S\s]*', string2)
['apple24']
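For comparison, a minimal sketch (not part of the original answer) that gets the same first match without the surrounding [\S\s]* parts. re.search() already stops at the first occurrence, so the rest of the string never has to be consumed. The helper name first_apple is hypothetical:

```python
import re

def first_apple(text):
    """Return the first 'apple' followed by digits, or None if absent."""
    match = re.search(r'apple\d+', text)
    return match.group() if match else None

string1 = 'pear33 apple14 apple24 orange22 orange44'
string2 = 'pear33 apple24 apple14 orange22 orange44'

print(first_apple(string1))  # apple14
print(first_apple(string2))  # apple24
```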
Upvotes: 1
Reputation: 1959
Edit: used search() instead of findall()
second example:
import re
# with "if then else" in search string
string = 'pear33 if then else apple14'
match = re.search(r'if then|else', string)
print(match.group())
output:
if then
# no "if" in search string
string = 'pear33 then else apple14'
match = re.search(r'if then|else', string)
print(match.group())
output:
else
for first example
import re
string = 'pear33 apple24 orange22 orange44'
match = re.findall(r'(apple1\d|apple2\d)', string)
print(match)
output:
['apple24']
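One caveat worth checking: alternation does not express a preference between the branches across the whole string; the match that starts earliest wins. A small sketch with both apples present in reverse order (the test string is an assumption for illustration):

```python
import re

# apple24 appears before apple14, so it is found first even though
# apple1\d is listed first in the alternation.
string = 'pear33 apple24 apple14 orange22'

match = re.search(r'apple1\d|apple2\d', string)
all_matches = re.findall(r'(apple1\d|apple2\d)', string)

print(match.group())  # apple24
print(all_matches)    # ['apple24', 'apple14']
```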
Upvotes: 0
Reputation: 781068
Use:
(?(?=apple1\d)apple1\d|apple2\d)
The IF part should be a lookahead, so it's not included in the match requirement when the ELSE branch is taken.
If you don't want to repeat the IF expression in THEN, you can use a backreference.
(?(?=(apple1\d))\1|apple2\d)
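Note that a lookahead as the condition is PCRE-style syntax. Python's built-in re module only accepts a group number or name in that position and rejects the pattern, which matches the "bad character in group name" error reported in another answer. A minimal sketch of the same if/else preference in Python, assuming two separate searches are acceptable (the function name prefer_apple1 is hypothetical):

```python
import re

def prefer_apple1(text):
    """Return the first apple1N if any exists, otherwise the first apple2N."""
    match = re.search(r'apple1\d', text) or re.search(r'apple2\d', text)
    return match.group() if match else None

# The conditional-with-lookahead form does not compile in Python's re.
try:
    re.compile(r'(?(?=apple1\d)apple1\d|apple2\d)')
    compiled = True
except re.error:
    compiled = False

print(compiled)  # False
print(prefer_apple1('pear33 apple14 apple24 orange22 orange44'))  # apple14
print(prefer_apple1('pear33 apple24 orange22 orange44'))          # apple24
print(prefer_apple1('pear33 orange22'))                           # None
```

Unlike a single left-to-right pattern, this also returns apple14 for a string such as 'pear33 apple24 apple14', because the apple1\d search covers the whole string before the fallback is tried.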
Upvotes: 1