VKB
Reputation: 65

Stripping the last occurrence of text inside parentheses from a string

I would like to know how to strip the last occurrence of () and its contents from a given string.

The code below strips every () group from the string, not just the last one.

import re
bracketedString     = '*AWL* (GREATER) MINDS LIMITED (CLOSED)'
nonBracketedString  = re.sub(r"\s\(.*?\)", '', bracketedString)
print(nonBracketedString) # => *AWL* MINDS LIMITED

I would like the following output.

*AWL* (GREATER) MINDS LIMITED

Upvotes: 3

Views: 545

Answers (2)

Wiktor Stribiżew
Reputation: 627020

You can remove a (...) substring, together with any whitespace before it, only when it sits at the end of the string:

\s*\([^()]*\)$

Details

  • \s* - 0+ whitespace chars
  • \( - a (
  • [^()]* - 0+ chars other than ( and )
  • \) - a )
  • $ - end of string.

In Python:

import re
bracketedString     = '*AWL* (GREATER) MINDS LIMITED (CLOSED)'
nonBracketedString  = re.sub(r"\s*\([^()]*\)$", '', bracketedString)
print(nonBracketedString) # => *AWL* (GREATER) MINDS LIMITED
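
To make the "end of the string only" behaviour explicit, here is a minimal sketch that wraps the same pattern in a helper and tries it on a few inputs. The function name strip_trailing_parens and the extra sample strings are assumptions for illustration, not part of the question.

```python
import re

# Same pattern as above, compiled once for reuse.
TRAILING_PARENS = re.compile(r"\s*\([^()]*\)$")

def strip_trailing_parens(text):  # hypothetical helper name
    """Remove a (...) group, and the whitespace before it, at the end of text."""
    return TRAILING_PARENS.sub("", text)

print(strip_trailing_parens("*AWL* (GREATER) MINDS LIMITED (CLOSED)"))
# => *AWL* (GREATER) MINDS LIMITED

# Nothing is removed when the last (...) is not at the very end,
# or when there are no parentheses at all.
print(strip_trailing_parens("*AWL* (GREATER) MINDS LIMITED (CLOSED) END"))
# => *AWL* (GREATER) MINDS LIMITED (CLOSED) END
print(strip_trailing_parens("NO BRACKETS HERE"))
# => NO BRACKETS HERE
```

Since the pattern is anchored with $, at most one substitution can happen per string.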

With the PyPI regex module you can also remove nested parentheses at the end of the string:

import regex
s = "*AWL* (GREATER) MINDS LIMITED (CLOSED(Jan))"
res = regex.sub(r'\s*(\((?>[^()]+|(?1))*\))$', '', s)
print(res) # => *AWL* (GREATER) MINDS LIMITED

Details

  • \s* - 0+ whitespaces
  • (\((?>[^()]+|(?1))*\)) - Group 1:
    • \( - a (
    • (?>[^()]+|(?1))* - zero or more repetitions of 1+ chars other than ( and ) or the whole Group 1 pattern
    • \) - a )
  • $ - end of string.
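
If installing the regex module is not an option, the same idea can be sketched without any regex, since the built-in re module has no recursion. This is an alternative technique, not the approach above: it walks backwards from the final ) and counts depth until the matching ( is found. The function name is hypothetical.

```python
def strip_trailing_nested_parens(text):  # hypothetical helper name
    """Remove a balanced (...) group at the end of text, plus whitespace before it."""
    if not text.endswith(")"):
        return text
    depth = 0
    # Walk backwards to find the "(" that balances the final ")".
    for i in range(len(text) - 1, -1, -1):
        ch = text[i]
        if ch == ")":
            depth += 1
        elif ch == "(":
            depth -= 1
            if depth == 0:
                # Drop the group and any whitespace in front of it.
                return text[:i].rstrip()
    return text  # unbalanced parentheses: leave the string unchanged

print(strip_trailing_nested_parens("*AWL* (GREATER) MINDS LIMITED (CLOSED(Jan))"))
# => *AWL* (GREATER) MINDS LIMITED
```

Unlike the regex version, this sketch makes its own choice for unbalanced input: it returns the string unchanged.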

Upvotes: 7

mrzasa
Reputation: 23327

If you want to remove the last pair of brackets even when it is not at the end of the string, for example in:

*AWL* (GREATER) MINDS LIMITED (CLOSED) END

you can use a tempered greedy token:

>>> import re
>>> re.sub(r"\([^)]*\)(((?!\().)*)$", r'\1', '*AWL* (GREATER) MINDS LIMITED (CLOSED) END')
# => '*AWL* (GREATER) MINDS LIMITED  END' (the spaces on both sides of the removed brackets remain)

Explanation:

  • \([^)]*\) matches a bracketed substring
  • (((?!\().)*)$ ensures that no other opening bracket follows before the end of the string

    • (?!\() is a negative lookahead checking that the next character is not (
    • . matches the next char (which cannot be ( because of the negative lookahead)
    • (((?!\().)*)$ repeats that sequence up to the end of the string $ and keeps it in a capturing group
  • the match is replaced with the first capturing group (\1), which holds the text after the brackets
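
The replacement above leaves the whitespace on both sides of the removed brackets, so two spaces remain. As a small variation (an assumption about the desired output, not part of the original pattern), a leading \s* can be added so the whitespace before the brackets is consumed as well. The helper name is hypothetical.

```python
import re

# Same tempered greedy token as above, with \s* prepended so the
# whitespace in front of the last (...) group is removed too.
LAST_PARENS = re.compile(r"\s*\([^)]*\)(((?!\().)*)$")

def strip_last_parens(text):  # hypothetical helper name
    """Remove the last (...) group wherever it is, keeping the text after it."""
    return LAST_PARENS.sub(r"\1", text)

print(strip_last_parens("*AWL* (GREATER) MINDS LIMITED (CLOSED) END"))
# => *AWL* (GREATER) MINDS LIMITED END
print(strip_last_parens("*AWL* (GREATER) MINDS LIMITED (CLOSED)"))
# => *AWL* (GREATER) MINDS LIMITED
```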

Upvotes: 1
