Reputation: 469
I have a column in a pandas DataFrame like the one shown below:
LGA
Alpine (S)
Ararat (RC)
Ballarat (C)
Banyule (C)
Bass Coast (S)
Baw Baw (S)
Bayside (C)
Benalla (RC)
Boroondara (C)
I want to remove the bracketed codes, e.g. (S) or (RC), from the end of each row.
The desired output is:
LGA
Alpine
Ararat
Ballarat
Banyule
Bass Coast
Baw Baw
Bayside
Benalla
Boroondara
I am not sure how to get the desired output shown above.
Any help would be appreciated.
Thanks
Upvotes: 6
Views: 2225
Reputation: 15558
You can use pandas' str.replace
…
dataf['LGA'] = dataf['LGA'].str.replace(r"\s*\([^()]*\)", "", regex=True)  # \s* also drops the space before the bracket
import pandas as pd
dataf = pd.DataFrame({
"LGA":\
"""Alpine (S)
Ararat (RC)
Ballarat (C)
Banyule (C)
Bass Coast (S)
Baw Baw (S)
Bayside (C)
Benalla (RC)
Boroondara (C)""".split("\n")
})
output = dataf['LGA'].str.replace(r"\s*\([^()]*\)", "", regex=True)  # \s* also drops the space before the bracket
print(output)
0 Alpine
1 Ararat
2 Ballarat
3 Banyule
4 Bass Coast
5 Baw Baw
6 Bayside
7 Benalla
8 Boroondara
Name: LGA, dtype: object
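Since the question only asks about the code at the end of each value, one possible variant anchors the pattern to the end of the string. This is a sketch rather than part of the answer above; the sample data is an assumption, and the last value ("Test (X) Shire") is invented purely to show that a bracket in the middle of a name is left alone.

```python
import pandas as pd

# Hypothetical sample data; the last value is made up to show a mid-string bracket.
dataf = pd.DataFrame(
    {"LGA": ["Alpine (S)", "Bass Coast (S)", "Benalla (RC)", "Test (X) Shire"]}
)

# \s*         optional whitespace before the bracket
# \([^()]*\)  one parenthesised group with no nested parentheses
# \s*$        optional trailing whitespace, then end of string
suffix = r"\s*\([^()]*\)\s*$"

cleaned = dataf["LGA"].str.replace(suffix, "", regex=True)
print(cleaned.tolist())
```

Without the `$` anchor, the same pattern would also remove " (X)" from the last value.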
Upvotes: 1
Reputation: 1059
I have a different approach using the re module. It deletes anything between round or square brackets:
import re
import pandas as pd
df = {'LGA': ['Alpine (S)', 'Ararat (RC)', 'Bass Coast (S)'] }
df = pd.DataFrame(df)
df['LGA'] = [re.sub(r"[\(\[].*?[\)\]]", "", x).strip() for x in df['LGA']]  # raw string avoids invalid-escape warnings; deletes anything between brackets, then trims whitespace
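To make the behaviour of that pattern concrete, here is a small sketch that uses only the standard library. The helper name `strip_bracketed` and the sample strings are assumptions for illustration, not part of the answer.

```python
import re

# An opening ( or [, then as few characters as possible, then a closing ) or ].
BRACKETED = re.compile(r"[\(\[].*?[\)\]]")

def strip_bracketed(text):
    """Remove every bracketed chunk, then trim surrounding whitespace."""
    return BRACKETED.sub("", text).strip()

# Hypothetical sample values, including one with square brackets and one with no suffix.
samples = ["Alpine (S)", "Ararat (RC)", "Bass Coast [S]", "No suffix"]
result = [strip_bracketed(s) for s in samples]
print(result)
```

Note that the two character classes are independent, so a mismatched pair such as "(S]" would also be removed, and the list comprehension in the answer will raise a TypeError if the column contains missing values (NaN), because re.sub expects a string.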
Upvotes: 2
Reputation: 26
import pandas as pd
df = {'LGA': ['Alpine (S)', 'Ararat (RC)', 'Bass Coast (S)'] }
df = pd.DataFrame(df)
# Split on the first "(", keep only the text before it, and trim the trailing space
df['LGA'] = df['LGA'].str.split('(', n=1).str[0].str.strip()
Upvotes: 1