Sonu Mishra

Reputation: 1779

Regular expression substitution in Python

I have a string:

line = "haha (as jfeoiwf) avsrv arv (as qwefo) afneoifew"

I want to remove every instance of "(as ...)" from this string with a regular expression, so that the output looks like:

line = "haha avsrv arv afneoifew"

I tried:

line = re.sub(r'\(+as .*\)','',line)

But this yields:

line = "haha afneoifew"

Upvotes: 4

Views: 72

Answers (5)

ivan K.

Reputation: 207

Try:

line = re.sub(r".\(as \w+\).", ' ', line)

Upvotes: 2

stephan

Reputation: 10265

To get non-greedy behaviour, you have to use *? instead of *, i.e. re.sub(r'\(+as .*?\)','',line). To get the desired string, you also have to match the trailing space, i.e. re.sub(r'\(+as .*?\) ','',line).
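A quick way to see the difference is to run the three variants side by side. This is a minimal sketch using the string from the question; the variable names greedy, lazy and lazy_space are only illustrative.

```python
import re

line = "haha (as jfeoiwf) avsrv arv (as qwefo) afneoifew"

# Greedy: .* runs to the LAST ')' in the string, so everything
# between the first '(' and the last ')' is removed.
greedy = re.sub(r'\(+as .*\)', '', line)

# Lazy: .*? stops at the FIRST ')' after each '(as ', so each
# group is removed separately (the surrounding spaces remain).
lazy = re.sub(r'\(+as .*?\)', '', line)

# Lazy plus a trailing space: also removes the space that follows
# each closing parenthesis.
lazy_space = re.sub(r'\(+as .*?\) ', '', line)

print(repr(greedy))      # 'haha  afneoifew'
print(repr(lazy))        # 'haha  avsrv arv  afneoifew'
print(repr(lazy_space))  # 'haha avsrv arv afneoifew'
```

Only the last variant yields single spaces, because it removes the space after each closing parenthesis as well.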

Upvotes: 4

Tomasz Plaskota

Reputation: 1367

You were very close. You need to add the lazy quantifier '?' after .*. By default, .* is greedy and matches as much as it possibly can. With the lazy quantifier, it matches as little as possible.

line = re.sub(r'\(+as .*?\) ','',line)

Upvotes: 2

3kt

Reputation: 2553

The problem is that your regexp matches this whole span: (as jfeoiwf) avsrv arv (as qwefo), hence your result.
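To check what the original pattern actually matches, re.search can be used to look at the matched text. A small sketch, assuming the string from the question (matched_text is just an illustrative name):

```python
import re

line = "haha (as jfeoiwf) avsrv arv (as qwefo) afneoifew"

# Search with the pattern from the question; the greedy .* extends
# the match up to the last ')' in the string.
match = re.search(r'\(+as .*\)', line)
matched_text = match.group(0)

print(matched_text)  # (as jfeoiwf) avsrv arv (as qwefo)
```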

You can use:

>>> import re
>>> line = "haha (as jfeoiwf) avsrv arv (as qwefo) afneoifew"
>>> line = re.sub(r'\(+as [a-zA-Z]*\)','',line)
>>> line
'haha  avsrv arv  afneoifew'
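The result above still contains double spaces where the groups were removed. If single spaces are wanted, one option (an assumption about the desired output, not part of the pattern above) is to normalize whitespace afterwards with str.split and str.join:

```python
import re

line = "haha (as jfeoiwf) avsrv arv (as qwefo) afneoifew"

# Remove the "(as ...)" groups; this leaves two spaces at each gap.
stripped = re.sub(r'\(+as [a-zA-Z]*\)', '', line)

# split() with no arguments splits on any run of whitespace, so
# joining with a single space collapses the doubled spaces.
normalized = ' '.join(stripped.split())

print(repr(normalized))  # 'haha avsrv arv afneoifew'
```

Be aware that this also collapses tabs and newlines and strips leading and trailing whitespace, which may or may not be acceptable for other inputs.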

Hope it'll be helpful.

Upvotes: 2

piRSquared

Reputation: 294258

Try:

re.sub(r'\(as[^\)]*\)', '', line)
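As a sketch of how this behaves on the string from the question: the negated class [^\)]* cannot cross a closing parenthesis, so each group is matched on its own without a lazy quantifier. The second pattern below, with an optional trailing space, is an added variation rather than part of the suggestion above.

```python
import re

line = "haha (as jfeoiwf) avsrv arv (as qwefo) afneoifew"

# [^\)]* matches any characters except ')', so the match ends at
# the first closing parenthesis after "(as".
removed = re.sub(r'\(as[^\)]*\)', '', line)

# Variation (assumption): also consume one optional space after the
# group so that no doubled spaces are left behind.
removed_with_space = re.sub(r'\(as[^\)]*\) ?', '', line)

print(repr(removed))             # 'haha  avsrv arv  afneoifew'
print(repr(removed_with_space))  # 'haha avsrv arv afneoifew'
```

Because there is no space after "as" in this pattern, it would also remove groups such as "(asdf)"; adding a space after "as" restricts it to the form shown in the question.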

Upvotes: 1
