blue-sky

Reputation: 53826

Regex to ignore data between brackets

I replace the characters ", { and } with an empty string using the code below.

This code :

import re

s = "\":{},"
print(s)
print(re.sub(r'\"|{|}', "", s))

prints:

":{},
:,

which is expected.

I'm attempting to modify the regex to ignore everything between open and closed brackets. So for the string "\":{},[test,test2]" just :,[test,test2] should be returned.

How can I modify the regex so that it is not applied to the data between [ and ]?

I tried using:

s = "\":{},[test1, test2]"
print(s)
print(re.sub(r'[^a-zA-Z {}]+\"|{|}' , "",s))

(src: How to let regex ignore everything between brackets?)

but none of the , characters are replaced.

Upvotes: 3

Views: 1358

Answers (2)

anubhava

Reputation: 785296

Assuming your brackets are balanced and unescaped, you may use this regex with a negative lookahead to assert that the matched character is not inside [...]:

>>> import re
>>> s = "\":{},[test1,test2]"
>>> print (re.sub(r'[{}",](?![^[]*\])', '', s))
:[test1,test2]

RegEx Demo

RegEx Details:

  • [{}",]: Match one of the characters {, }, " or ,
  • (?![^[]*\]): Negative lookahead to assert that there is no ] ahead without a [ in between; in other words, the matched character is not inside [...]
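As a quick check of the lookahead, here is a small sketch that wraps the substitution in a helper and runs it on an input with two bracketed groups. The helper name strip_outside_brackets and the second test string are made up for illustration, and the sketch relies on the same assumption as above: brackets are balanced and not nested.

```python
import re

# Characters to delete only when they are NOT inside [...]
OUTSIDE_BRACKETS = re.compile(r'[{}",](?![^[]*\])')

def strip_outside_brackets(text):
    """Remove {, }, " and , unless they sit inside square brackets."""
    return OUTSIDE_BRACKETS.sub('', text)

print(strip_outside_brackets('":{},[test1,test2]'))       # :[test1,test2]
print(strip_outside_brackets('{"a",[x,y],"b",[p,"q"]}'))  # a[x,y]b[p,"q"]
```

The quotes and commas inside each [...] group survive because a ] can be reached from them without crossing a [.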

Upvotes: 2

Wiktor Stribiżew

Reputation: 626952

If you want to remove the {, }, , and " not inside square brackets, you can use

re.sub(r'(\[[^][]*])|[{}",]', r'\1', s)

See the regex demo. Note that you can add more chars to the character set, [{}",]. If you need to add a hyphen, make sure it is the last char in the character set. Escape \, ] (unless it is the first char, right after [) and ^ (if it comes first, right after [).

Details:

  • (\[[^][]*]) - Capturing group 1: a [...] substring
  • | - or
  • [{}",] - a {, }, , or " char.

See a Python demo using your sample input:

import re
s = "\":{},[test1, test2]"
print( re.sub(r'(\[[^][]*])|[{}",]', r'\1', s) )
## => :[test1, test2]
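A variation on the same idea, as a sketch only: the character set is extended with a hyphen (placed last, per the note above) and a replacement function is used instead of r'\1'. The names PATTERN and keep_brackets and the sample string are illustrative. The function form makes the "keep group 1, drop everything else" logic explicit and also works on Python versions before 3.5, where referencing an unmatched group in the replacement string raised an error.

```python
import re

# Group 1 captures a whole [...] block; the second alternative matches a
# single character to drop. The hyphen is last so it is taken literally.
PATTERN = re.compile(r'(\[[^][]*])|[{}",-]')

def keep_brackets(match):
    # Return the bracketed block unchanged, or '' for a lone character.
    return match.group(1) or ''

s = '"-":{},[test1, test2-3]'
print(PATTERN.sub(keep_brackets, s))
## => :[test1, test2-3]
```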

Upvotes: 1
