Reputation: 197
I have a problem removing empty brackets from strings. I tried a few methods, but none of them worked. Kindly help.
Here is the dataframe:
import pandas as pd
data = {'disc': ['( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate','( ) ( s ) -isopropyl 2 ','( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane ( ) boc-epoxideide']}
df1 = pd.DataFrame(data)
print(df1)
The strings contain multiple occurrences of ( ).
I need to remove only the empty brackets.
input:
disc
0 ( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate
1 ( ) ( s ) -isopropyl 2
2 ( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane ( ) boc-epoxideide
output:
disc
0 -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate
1 ( s ) -isopropyl 2
2 ( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane boc-epoxideide
Using a plain replace on the bracket characters does not help, because it removes all brackets in the string.
Upvotes: 2
Views: 1458
Reputation: 36510
pandas.DataFrame.replace does support using regex, so you can do:
import pandas as pd
data = {'disc': ['( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate','( ) ( s ) -isopropyl 2 ','( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane ( ) boc-epoxideide']}
df1 = pd.DataFrame(data)
df2 = df1.replace(r'\s*\(\s*\)\s*', '', regex=True)
print(df2)
Output:
disc
0 -2,4-dichloro-a- ( chloromethyl ) -benzenemeth...
1 ( s ) -isopropyl 2
2 ( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylb...
Note that you have to tell replace to use regular expressions (regex=True). I used a so-called raw string to simplify escaping; ( and ) need to be escaped because they have special meaning in a pattern. As for the pattern itself, I allowed zero or more whitespace characters (\s*) between the brackets, and also before ( and after ), so that leading/trailing whitespace is removed as well.
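One side effect worth checking: because the pattern above also consumes the whitespace on both sides, an empty pair in the middle of a string can join the neighbouring words. The sketch below uses the standard re module on plain strings to show this, together with a hypothetical variant (not part of the original answer) that substitutes a single space and trims the ends. The names samples, glued and spaced are illustrative only.

```python
import re

samples = [
    '( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate',
    '( ) ( s ) -isopropyl 2 ',
    '( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane ( ) boc-epoxideide',
]

# Pattern from the answer: empty brackets plus any surrounding whitespace.
pattern = re.compile(r'\s*\(\s*\)\s*')

# Replacing with '' also deletes the spaces that separated neighbouring words.
glued = [pattern.sub('', s) for s in samples]

# Hypothetical variant: replace with one space, then trim both ends.
spaced = [pattern.sub(' ', s).strip() for s in samples]

print(glued[2])   # ends with "-4-phenylbutaneboc-epoxideide"
print(spaced[2])  # ends with "-4-phenylbutane boc-epoxideide"
```

Assuming the column is called disc as in the question, the same idea on the dataframe would be df1['disc'].str.replace(r'\s*\(\s*\)\s*', ' ', regex=True).str.strip().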
Upvotes: 1
Reputation: 1086
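You can try a regular expression with str.replace:
df1["disc"] = df1["disc"].str.replace(r"\(\s+\)", "", regex=True)
\s+ matches one or more whitespace characters between the two brackets. Passing regex=True is required in pandas 2.0 and later, where str.replace treats the pattern as a literal string by default. Output: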
-2,4-dichloro-a- ( chloromethyl ) -benzenemethanol methanesulfonate
( s ) -isopropyl 2
( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane boc-epoxideide
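To illustrate the choice of quantifier, here is a small sketch with the standard re module. It compares \s+ (at least one whitespace character between the brackets) with \s* (which also matches "()" with nothing in between). The test string and variable names are made up for the example.

```python
import re

one_or_more = re.compile(r'\(\s+\)')   # needs at least one space inside
zero_or_more = re.compile(r'\(\s*\)')  # also matches "()"

text = 'a () b ( ) c (   ) d ( x )'

with_plus = one_or_more.sub('', text)
with_star = zero_or_more.sub('', text)

print(repr(with_plus))  # 'a () b  c  d ( x )'
print(repr(with_star))  # 'a  b  c  d ( x )'
```

Neither pattern touches "( x )", because the brackets there are not empty. Both leave doubled spaces behind, which may or may not matter for the data at hand.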
Upvotes: 1
Reputation: 1458
replace should work:
>>> a = "'( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol"
>>> a.replace("( )", "")
"' -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol"
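A literal replace like this only matches the exact text "( )" with a single space, and it leaves the surrounding spaces in place. A minimal sketch of one way to tidy that up, assuming it is acceptable to collapse every run of whitespace to one space (drop_empty_brackets is a hypothetical helper name):

```python
def drop_empty_brackets(text):
    # Remove the literal "( )" only; other brackets are left alone.
    without = text.replace("( )", "")
    # Collapse runs of whitespace and trim the ends.
    return " ".join(without.split())


first = drop_empty_brackets("( ) -2,4-dichloro-a- ( chloromethyl ) -benzenemethanol")
third = drop_empty_brackets(
    "( 2s3s ) -12-epoxy-3- ( boc-amino ) -4-phenylbutane ( ) boc-epoxideide"
)
print(first)
print(third)
```

On a pandas column the literal form would be df1["disc"].str.replace("( )", "", regex=False), where regex=False makes the brackets ordinary characters.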
Upvotes: 1