naveen kumar

Reputation: 197

Remove empty brackets ( ) from string

I am having trouble removing empty brackets from strings. I tried a few methods, but none of them worked. Kindly help.

Here is the DataFrame:

import pandas as pd
data = {'disc': ['( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate','( ) ( s ) -isopropyl 2 ','( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane ( ) boc-epoxideide']}
df1 = pd.DataFrame(data)
print(df1)

The strings contain multiple occurrences of ( ); I need to remove only the empty brackets.

input:

      disc
0   ( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate
1   ( ) ( s ) -isopropyl 2 
2   ( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-p

output:

     disc
0   -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate
1   ( s ) -isopropyl 2 
2   ( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane boc-epoxideide

Using replace does not help, because it removes all the brackets in the string.

Upvotes: 2

Views: 1458

Answers (3)

Daweo

Reputation: 36510

pandas.DataFrame.replace does support using regex, so you can do:

import pandas as pd
data = {'disc': ['( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate','( ) ( s ) -isopropyl 2 ','( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane ( ) boc-epoxideide']}
df1 = pd.DataFrame(data)
df2 = df1.replace(r'\s*\(\s*\)\s*', '', regex=True)
print(df2)

Output:

                                                disc
0  -2,4-dichloro-a- ( chloromethyl ) -benzenemeth...
1                                ( s ) -isopropyl 2
2  ( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylb...

Note that you have to tell replace to use regular expressions (regex=True). I used a raw string to simplify escaping; ( and ) need to be escaped because they have special meaning in a pattern. As for the pattern itself, it matches zero or more whitespace characters (\s*) before, inside and after the brackets, so the whitespace around ( ) is removed as well.
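One thing to watch: because the pattern also consumes the whitespace on both sides, replacing with an empty string joins the neighbouring words when ( ) sits in the middle of a string (the third row ends in phenylbutaneboc-epoxideide). Here is a minimal sketch with the standard re module, assuming you would rather leave a single space behind and trim the ends. The helper name drop_empty_brackets is hypothetical; the same pattern and replacement could be passed to df1.replace(..., regex=True).

```python
import re

# Same pattern as above: an empty "( )" plus any whitespace around it.
EMPTY_BRACKETS = re.compile(r'\s*\(\s*\)\s*')

def drop_empty_brackets(text):
    """Hypothetical helper: swap each empty bracket pair for one space."""
    # Substituting a space (not '') keeps neighbouring words apart;
    # strip() then removes the space left at the start or end.
    return EMPTY_BRACKETS.sub(' ', text).strip()

rows = [
    '( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate',
    '( ) ( s ) -isopropyl 2 ',
    '( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane ( ) boc-epoxideide',
]
for row in rows:
    print(drop_empty_brackets(row))
```

Brackets with content, such as ( s ) or ( boc-amino ), are left alone because the pattern only allows whitespace between the two parentheses.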

Upvotes: 1

Anurag Wagh

Reputation: 1086

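You can try using a regex with str.replace:

df1["disc"] = df1["disc"].str.replace(r"\(\s+\)", "", regex=True)

\s+ matches one or more whitespace characters between the two brackets. Pass regex=True explicitly, because recent pandas versions treat the pattern as a literal string by default.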

 -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate
 ( s ) -isopropyl 2 
( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane  boc-epoxideide

Upvotes: 1

Bendik Knapstad

Reputation: 1458

replace should work:


>>> a = "'( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol"
>>> a.replace("( )", "")
"' -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol"
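A caveat, shown as a small sketch on made-up sample strings (the second and third are assumptions, not taken from the question): a literal replace removes only the exact three characters "( )", so brackets with no space or with two spaces between them are left in place. On a DataFrame column the literal equivalent would be Series.str.replace with regex=False.

```python
# Literal replacement: only the exact text "( )" is removed.
samples = [
    '( ) ( s ) -isopropyl 2 ',  # from the question
    '(  ) two spaces',          # made-up: two spaces inside the brackets
    '() no space',              # made-up: nothing inside the brackets
]
results = [s.replace('( )', '') for s in samples]
for r in results:
    print(repr(r))
```

Only the first string changes. The other two need a pattern such as the regex answers use.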

Upvotes: 1
