user3075641

Reputation: 21

How to use positive lookbehind with If-Then-Else regex in Python

I'm trying to combine a positive lookbehind with the If-Then-Else syntax for regex in Python.

I'm parsing some data and need to use two different markers to decide where to split each string.

For example, if data = "(I want) some ice cream", I want to split the string after (I want). But I might also get data = "I want some ice cream", in which case I want to split the string after I.

The problem is that I can't rely on the first whitespace to find the split point, because (I want) itself contains a space.

Using concepts from http://www.regular-expressions.info/conditional.html, I want to create an If-Then-Else regex whose condition is a lookbehind that checks whether the string starts with ( or not.

Here's what I have so far:

(?(?<=(^\())(^(.*?)\)|^(.*?)( ))

The intent: if the string starts with "(", match up to the first ")". Otherwise, match up to the first space. This doesn't work, however.
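To see how the attempt fails, here is a minimal check, assuming the standard-library `re` module. As far as I can tell, `re` only accepts a group number or name as the condition in `(?(...)yes|no)`, so the pattern is rejected at compile time rather than simply failing to match. The helper name `compile_error` is made up for illustration.

```python
import re

# The pattern from the question, unchanged.
PATTERN = r"(?(?<=(^\())(^(.*?)\)|^(.*?)( ))"

def compile_error(pattern):
    """Return the compile error message for pattern, or None if it compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None

# Prints the error message that re reports for the pattern.
print(compile_error(PATTERN))
```

The exact wording of the message varies between Python versions, so only the fact that an error is raised should be relied on.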

Upvotes: 2

Views: 755

Answers (2)

user557597

Your assertion is misplaced here because you haven't actually moved past the first parenthesis. Something like this is more appropriate.

 # ^((?:\([^)]*\)|\S*))


 ^ 
 (                             # (1)
      (?:
           \( [^)]* \)
        |  \S* 
      )
 )
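A minimal Python sketch of how the alternation above could be used to do the split from the question. The helper name `split_head` and the choice to strip the whitespace between the two parts are assumptions for illustration, not part of the original answer.

```python
import re

# Either a parenthesised chunk "(...)" or a run of non-whitespace, at the start.
HEAD = re.compile(r"^((?:\([^)]*\)|\S*))")

def split_head(data):
    """Split data into (leading chunk, remainder)."""
    # Because of \S*, the pattern can match the empty string,
    # so match() always succeeds on a str and m is never None.
    m = HEAD.match(data)
    head = m.group(1)
    rest = data[m.end():].lstrip()  # drop the separating whitespace
    return head, rest

print(split_head("(I want) some ice cream"))  # ('(I want)', 'some ice cream')
print(split_head("I want some ice cream"))    # ('I', 'want some ice cream')
```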

Since the test applies at the beginning of the string, a conditional here should use a lookahead assertion as its condition, not a lookbehind.

 #  ^((?(?=\()\([^)]*\)|\S*))

      ^ 
 1    (
 c         (?(?= \( )
                \( [^)]* \)    # yes, it's a parenthesis: match '(..)'
             |  
                \S*            # no, match until first space
           )
 1    )
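The lookahead condition above is written in PCRE-style syntax. In Python's standard `re` module, to my knowledge, the condition must be a group number or name, so a rough equivalent is to capture the optional "(" and branch on whether that group took part in the match. This is a sketch under that assumption; the name `leading_chunk` is hypothetical.

```python
import re

# Group 1 optionally captures "(".
# (?(1)yes|no): if group 1 matched, consume up to and including ")";
# otherwise consume a run of non-whitespace.
COND = re.compile(r"^(\()?(?(1)[^)]*\)|\S*)")

def leading_chunk(data):
    """Return the leading '(...)' chunk or the first word of data."""
    m = COND.match(data)
    return m.group(0) if m else None

print(leading_chunk("(I want) some ice cream"))  # (I want)
print(leading_chunk("I want some ice cream"))    # I
```

The whole match (`group(0)`) is used instead of group 1, because group 1 here holds only the opening parenthesis.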

For @hwnd: I liked your commented regex and wanted to see it through the RegexFormat app.
(Looks good!)

 ^                # the beginning of the string
 (                # (1 start), group and capture to \1:
      (?:              # group, but do not capture:
           \(               # '('
           [^)]*            # any character except: ')' (0 or more times)
           \)               # ')'
        |                   # OR
           \S+              # non-whitespace (all but \n, \r, \t, \f, and " ") (1 or more times)
      )                # end of grouping
 )                # (1 end), end of \1

Upvotes: 1

hwnd

Reputation: 70732

"If string starts with ( then match until the first ). Else match until the first space. This doesn't work."

I really see no need for the If-Then-Else conditional here; you could do something like this:

^((?:\([^)]*\)|\S+))

Regular expression:

^              the beginning of the string
(              group and capture to \1:
 (?:           group, but do not capture:
  \(           '('
  [^)]*        any character except: ')' (0 or more times)
  \)           ')'
   |           OR
   \S+         non-whitespace (all but \n, \r, \t, \f, and " ") (1 or more times)
  )            end of grouping
 )             end of \1
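The commented breakdown above maps fairly directly onto Python's `re.VERBOSE` flag, which ignores whitespace in the pattern and allows `#` comments. A minimal sketch, assuming the standard `re` module; the function name `first_token` is made up for illustration.

```python
import re

HEAD = re.compile(
    r"""
    ^                # the beginning of the string
    (                # group and capture to group 1:
      (?:            #   group, but do not capture:
        \( [^)]* \)  #     '(' then any characters except ')' then ')'
        |            #     OR
        \S+          #     one or more non-whitespace characters
      )
    )
    """,
    re.VERBOSE,
)

def first_token(data):
    """Return the leading '(...)' chunk or first word, or None if there is none."""
    m = HEAD.match(data)
    return m.group(1) if m else None

print(first_token("(I want) some ice cream"))  # (I want)
print(first_token("I want some ice cream"))    # I
```

Because this version uses `\S+` and not `\S*`, a string that is empty or begins with whitespace produces no match at all, which is why the `None` check is needed here.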

Upvotes: 1
