chromebookdev

Reputation: 494

Regex to remove some whitespace with a lookahead and lookbehind

I'm working through a dataframe in Python and cleaning up records. Some contain store numbers, slashes and whitespace that I need to remove, leaving only a name and suburb.

An example of the text I'm working with is below:

Storename (Suburb / 1234     )
Storename (Surbub Suburb / 1234      )

I'm trying to get the regex to remove the spaces before the closing bracket, but only back as far as the letters.

With the net result becoming:

Storename (Suburb)
Storename (Suburb)

I've been able to get the slash and numbers out with this:

test.LocationName.str.replace('[/0-9]','',regex=True)

But I can't work out the regex to remove the whitespace before the closing parenthesis.

Upvotes: 0

Views: 86

Answers (2)

Chris

Reputation: 29742

Use re.sub:

import re

re.sub(r"\((\S+).+?\)", r"(\1)", "Storename (Suburb / 1234     )")
re.sub(r"\((\S+).+?\)", r"(\1)", "Storename (Surbub Suburb / 1234      )")

Output:

'Storename (Suburb)'
'Storename (Surbub)'
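
Note that `\S+` captures only the first word inside the brackets, so a two-word suburb is cut short, as the second output shows. If the whole suburb should be kept, one possible variant is sketched below. It assumes every record has the form `(suburb / number)`; the helper name `clean_location` is hypothetical and not part of the original answer.

```python
import re

# Lazily capture everything between "(" and the slash, then match the
# slash, the store number and any whitespace before ")" so they are dropped.
PATTERN = re.compile(r"\(\s*(.+?)\s*/\s*\d+\s*\)")

def clean_location(text):
    """Hypothetical helper: keep only the suburb inside the brackets."""
    return PATTERN.sub(r"(\1)", text)

print(clean_location("Storename (Suburb / 1234     )"))
print(clean_location("Storename (Surbub Suburb / 1234      )"))
```

Records without a slash and number inside the brackets do not match and are returned unchanged.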

Upvotes: 0

Jan

Reputation: 43169

You might use

test.LocationName.str.replace(r'\s*/\s*\d+\s*','',regex=True)

See a demo on regex101.com.
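
The pattern reads as: optional whitespace, a slash, optional whitespace, one or more digits, then any whitespace up to the closing bracket. No lookahead or lookbehind is needed, because the bracket itself is never matched. A quick check with plain `re` on the two sample strings from the question (the variable names here are illustrative only):

```python
import re

# Same pattern as in the str.replace call, written as a raw string.
pattern = r"\s*/\s*\d+\s*"

samples = [
    "Storename (Suburb / 1234     )",
    "Storename (Surbub Suburb / 1234      )",
]

# Remove the slash, the store number and the surrounding whitespace.
cleaned = [re.sub(pattern, "", s) for s in samples]
print(cleaned)
```

This should leave `Storename (Suburb)` and `Storename (Surbub Suburb)`, so multi-word suburbs are preserved.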

Upvotes: 1
