jadeidev

Reputation: 194

If-Then-Else regex statement

I am trying to write a regular expression that captures <expression1> if it occurs in the string and otherwise captures <expression2>.

I tried something along the lines of (IF)(?(1)THEN|ELSE), meaning the match would be IFTHEN (if IF is found) or ELSE (if IF is not found).

For example:

(apple1\d)(?(1)|apple2\d)

case1: for the string: pear33 apple14 apple24 orange22 orange44

Result would be: apple14

case2: In contrast for the string: pear33 apple24 orange22 orange44

The result would be: apple24 (since there is no apple1 it would capture apple2\d)

My regex works for case1, where it returns apple14, but the ELSE branch doesn't work: I expect it to return apple24 for case2, and it doesn't.

Upvotes: 3

Views: 1207

Answers (3)

FailSafe

Reputation: 482

To start off, I'm not sure why you'd need an if-else statement for this (See version 2 of my answer), but I'll try to provide a few solutions.

For me, @Barmar's solution gave the error "bad character in group name", although I'm sure that with proper tweaking it may be the optimal solution.

Until he gets back, however, you can try these (although search.group() and search.groups() do annoy me a bit in how they handle capture groups, or the lack thereof).


VERSION 1: An ultra-specific version, based on the solutions suggested above. I don't consider this one desirable.

>>> import re


>>> string1 = 'pear33 apple14 apple24 orange22 orange44'
>>> string2 = 'pear33 apple24 apple14 orange22 orange44'


>>> re.findall(r'(?<!apple[12]\d)[\s]+(apple1\d|apple2\d)', string1)
['apple14']
>>> re.findall(r'(?<!apple[12]\d)[\s]+(apple1\d|apple2\d)', string2)
['apple24']


>>> re.search(r'(?<!apple[12]\d)[\s]+(apple1\d|apple2\d)', string1).group()
' apple14'
>>> re.search(r'(?<!apple[12]\d)[\s]+(apple1\d|apple2\d)', string2).group()
' apple24'

VERSIONS 2 AND 3: Much better and more scalable versions, in my opinion. I'm partial to version 2. To be honest, though, these patterns match the entire string, so they can tie up memory on long inputs; for short strings they work fine.

>>> string1 = 'pear33 apple14 apple24 orange22 orange44'
>>> string2 = 'pear33 apple24 apple14 orange22 orange44'


>>> re.findall(r'[\S\s]*?(apple[\d]+)[\S\s]*', string1)
['apple14']
>>> re.findall(r'[\S\s]*?(apple[\d]+)[\S\s]*', string2)
['apple24']


>>> re.findall(r'(?<!apple\d\d)[\S\s]+?(apple[\d]+)[\S\s]*', string1)
['apple14']
>>> re.findall(r'(?<!apple\d\d)[\S\s]+?(apple[\d]+)[\S\s]*', string2)
['apple24']

Upvotes: 1

Mahmoud Elshahat

Reputation: 1959

Edit: used search() instead of findall()

Second example:

import re

# with "if then else" in search string
string = 'pear33 if then else apple14'
match = re.search(r'if then|else', string)
print(match.group())

output:

if then

# no "if" in search string

string = 'pear33  then else apple14'
match = re.search(r'if then|else', string)
print(match.group())

output:

else

For the first example:

import re
string = 'pear33  apple24 orange22 orange44'
match = re.findall(r'(apple1\d|apple2\d)', string)
print(match)

output:

['apple24']

Upvotes: 0

Barmar

Reputation: 781068

Use:

(?(?=apple1\d)apple1\d|apple2\d)

The IF part should be a lookahead, so it's not included in the match requirement when the ELSE branch is taken.
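
To see why a capturing group in the IF position fails, here is a minimal sketch assuming Python's re module (the engine is an assumption; the question does not name one). Group 1 in the question's pattern is mandatory, so the whole match already fails whenever apple1\d is absent and the ELSE branch is never tried. The names original, optional, and matched_text are hypothetical, and making the group optional is a swapped-in workaround rather than the lookahead form shown above.

```python
import re

# Pattern from the question: group 1 is mandatory, so the ELSE branch
# of the conditional can never be reached.
original = re.compile(r'(apple1\d)(?(1)|apple2\d)')

# Hypothetical tweak: make group 1 optional so the ELSE branch can run
# at positions where apple1\d does not match.
optional = re.compile(r'(apple1\d)?(?(1)|apple2\d)')

case1 = 'pear33 apple14 apple24 orange22 orange44'
case2 = 'pear33 apple24 orange22 orange44'


def matched_text(pattern, text):
    """Return the matched text, or None when the pattern does not match."""
    m = pattern.search(text)
    return m.group() if m else None


print(matched_text(original, case1))  # apple14
print(matched_text(original, case2))  # None
print(matched_text(optional, case1))  # apple14
print(matched_text(optional, case2))  # apple24
```

With the optional group, the condition is evaluated at each starting position, so the result is the leftmost apple1\d or apple2\d token rather than a string-wide preference for apple1\d.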

If you don't want to repeat the IF expression in THEN, you can use a backreference.

(?(?=(apple1\d))\1|apple2\d)
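
Not every engine accepts a lookahead as the condition. Python's re module only takes a group number or name there, which is consistent with the "bad character in group name" error reported in another answer. Below is a rough sketch of one alternative in Python, assuming the goal is "prefer apple1\d anywhere in the string, otherwise fall back to apple2\d". The helper prefer_apple1 is hypothetical, and it uses two plain searches instead of a conditional.

```python
import re


def prefer_apple1(text):
    """Return the first apple1N if one exists, otherwise the first apple2N."""
    # Hypothetical helper: try the preferred pattern first, then the fallback.
    m = re.search(r'apple1\d', text) or re.search(r'apple2\d', text)
    return m.group() if m else None


# Python's re rejects a lookahead used as the condition at compile time.
try:
    re.compile(r'(?(?=apple1\d)apple1\d|apple2\d)')
    lookahead_condition_supported = True
except re.error:
    lookahead_condition_supported = False

print(lookahead_condition_supported)  # False with Python's re
print(prefer_apple1('pear33 apple14 apple24 orange22 orange44'))  # apple14
print(prefer_apple1('pear33 apple24 apple14 orange22 orange44'))  # apple14
print(prefer_apple1('pear33 apple24 orange22 orange44'))          # apple24
```

This reading treats apple1\d as preferred wherever it appears; if the first token in the string should win instead, a plain alternation such as apple1\d|apple2\d with re.search would be the simpler choice.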

Upvotes: 1
