Jay

Reputation: 1422

python string manipulation

I have a string s with nested brackets: s = "AX(p>q)&E((-p)Ur)"

I want to remove all characters between every pair of brackets, including the brackets themselves, and store the result in a new string like this: new_string = "AX&E"

I tried this:

import re
p = re.compile(r"\(.*?\)", re.DOTALL)
new_string = p.sub("", s)

It gives the output: AX&EUr)

Is there a way to correct this without iterating over each character in the string?

Upvotes: 6

Views: 1789

Answers (8)

noname

Reputation: 1

This is how you work with strings:

# strings
# double and single quotes use in Python
"hey there! welcome to CIP"   
'hey there! welcome to CIP'  
"you'll understand python"          
'i said, "python is awesome!"'      
'i can\'t live without python'      
# use of 'r' before string
print(r"\new code", "\n")    

first = "code in"
last = "python"
first + last     #concatenation

# slicing of strings

user = "code in python!"

print(user)
print(user[5])   # print an element 
print(user[-3])  # print an element from rear end
print(user[2:6]) # slicing the string
print(user[:6])  
print(user[2:])
print(len(user))   # length of the string
print(user.upper()) # convert to uppercase
print(user.lstrip())
print(user.rstrip())
print(max(user)) # character with the highest code point in user
print(min(user)) # character with the lowest code point in user
print(user.join(["1", "2", "3", "4"])) # join() needs strings, not ints

input()

Upvotes: 0

Kobi

Reputation: 138147

Another simple option is removing the innermost parentheses at every stage, until there are no more parentheses:

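import re

# Matches one pair of parentheses with no parentheses inside it.
p = re.compile(r"\([^()]*\)")
count = 1
while count:
    # subn() returns the new string and the number of replacements made.
    s, count = p.subn("", s)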

Working example: http://ideone.com/WicDK

Upvotes: 6

jfs

Reputation: 414915

You could use re.subn():

import re

s = 'AX(p>q)&E((-p)Ur)'
while True:
    s, n = re.subn(r'\([^)(]*\)', '', s)
    if n == 0:
        break
print(s)

Output

AX&E

Upvotes: 1

Kobi

Reputation: 138147

You can use PyParsing to parse the string:

from pyparsing import nestedExpr

s = "AX(p>q)&E((-p)Ur)"
expr = nestedExpr('(', ')')
result = expr.parseString('(' + s + ')').asList()[0]
s = ''.join(filter(lambda x: isinstance(x, str), result))
print(s)

Most code is from: How can a recursive regexp be implemented in python?

Upvotes: 2

Achim

Reputation: 15722

Nested brackets (or tags, ...) cannot be handled in a general way using regex. See http://www.amazon.de/Mastering-Regular-Expressions-Jeffrey-Friedl/dp/0596528124/ref=sr_1_1?ie=UTF8&s=gateway&qid=1304230523&sr=8-1-spell for details on why. You would need a real parser.

It's possible to construct a regex that handles two levels of nesting, but it is already ugly, and one for three levels will be quite long. And you don't want to think about four levels. ;-)
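
As a rough illustration of the two-level case (a sketch only, not taken from the book; the name two_level is made up here), a pattern that allows one layer of parentheses inside another might look like this:

```python
import re

# Hypothetical pattern: matches "(...)" whose contents are either
# non-parenthesis characters or one inner "(...)" with no further nesting.
two_level = re.compile(r"\((?:[^()]|\([^()]*\))*\)")

s = "AX(p>q)&E((-p)Ur)"
print(two_level.sub("", s))  # AX&E

# A third level is beyond this pattern and leaves parentheses behind.
print(two_level.sub("", "a(b(c(d)))"))  # a(b)
```

Each extra level means nesting the whole pattern inside itself once more, which is why it grows so quickly.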

Upvotes: 2

arussell84

Reputation: 2543

Yeah, it should be:

>>> import re
>>> s = "AX(p>q)&E(qUr)"
>>> p = re.compile(r"\(.*?\)", re.DOTALL)
>>> new_string = p.sub("", s)
>>> new_string
'AX&E'

Upvotes: 3

Daniel Kluev

Reputation: 11335

>>> import re
>>> s = "AX(p>q)&E(qUr)"
>>> re.compile(r"\([^)]*\)").sub('', s)
'AX&E'

Upvotes: 4

ghostdog74

Reputation: 343211

You can just use string manipulation, without regular expressions:

>>> s = "AX(p>q)&E(qUr)"
>>> [ i.split("(")[0] for i in s.split(")") ]
['AX', '&E', '']

I leave it to you to join the strings up.
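
For reference, a sketch of the join step, assuming the non-nested input above. With the nested string from the question the split approach would give 'AX&EUr', so a small depth counter is shown as one regex-free way to cope with nesting (the helper name strip_parens is hypothetical):

```python
s = "AX(p>q)&E(qUr)"
# Keep the text before "(" in every ")"-delimited chunk, then join.
joined = "".join(part.split("(")[0] for part in s.split(")"))
print(joined)  # AX&E


def strip_parens(text):
    """Drop everything inside parentheses, tracking nesting depth."""
    depth = 0
    kept = []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            # An unmatched ")" is dropped; depth never goes negative.
            if depth > 0:
                depth -= 1
        elif depth == 0:
            kept.append(ch)
    return "".join(kept)


print(strip_parens("AX(p>q)&E((-p)Ur)"))  # AX&E
```

The counter does iterate over every character, but it makes a single pass and works for any nesting depth.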

Upvotes: 5
