Reputation: 1
I just started studying pandas and have a question: I need to use a regular expression to find strings that match a certain pattern and replace the entire string with a fixed value. I used:
df.replace(to_replace=r'[Gg]ermany', value='Berlin', regex=True)
Source string: 'Platz der Republik 1, 10557 Berlin, germany'
My code outputs: 'Platz der Republik 1, 10557 Berlin, Berlin'
But I need the line: 'Berlin'
Upvotes: 0
Views: 64
Reputation: 763
The regular expression has to match the whole string, not just the country name, so allow arbitrary text before and after it. Try this:
df.replace(to_replace=r'.*[Gg]ermany.*', value='Berlin', regex=True)
I would also consider making the match case-insensitive (the inline equivalent of the IGNORECASE flag), so that any capitalization of the country name is detected:
df.replace(to_replace=r'.*(?i:germany).*', value='Berlin', regex=True)
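To see how the two patterns differ, here is a minimal sketch on made-up data. The column name "address" and the extra rows are assumptions for illustration only, not part of the question.

```python
import pandas as pd

# Hypothetical one-column frame; "address" and the sample rows are assumptions.
df = pd.DataFrame({
    "address": [
        "Platz der Republik 1, 10557 Berlin, germany",
        "Unter den Linden 6, 10117 Berlin, GERMANY",
        "1600 Pennsylvania Avenue NW, Washington, USA",
    ]
})

# Character-class version: only "Germany" or "germany" triggers a replacement.
by_class = df.replace(to_replace=r".*[Gg]ermany.*", value="Berlin", regex=True)

# Inline-flag version: any capitalization of "germany" triggers a replacement.
by_flag = df.replace(to_replace=r".*(?i:germany).*", value="Berlin", regex=True)

print(by_class["address"].tolist())
print(by_flag["address"].tolist())
```

Note that replace() returns a new DataFrame, so assign the result if you want to keep it.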
An alternative approach is to use re.search() on each cell (this needs import re and assumes every cell is a string):
df.applymap(lambda x: 'Berlin' if re.search(r'[Gg]ermany', x) else x)
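If some cells may be missing or you only care about one column, a boolean mask is another option. This is only a sketch under assumptions: the column name "address" and the sample rows are made up, and it swaps the element-wise lambda for Series.str.contains.

```python
import pandas as pd

# Hypothetical data with a missing value; "address" is an assumed column name.
df = pd.DataFrame({
    "address": [
        "Platz der Republik 1, 10557 Berlin, germany",
        None,
        "1600 Pennsylvania Avenue NW, Washington, USA",
    ]
})

# True where the cell contains "germany" in any capitalization;
# na=False treats missing values as non-matches.
is_german = df["address"].str.contains(r"germany", case=False, regex=True, na=False)

# Overwrite the whole cell wherever the mask is True.
df.loc[is_german, "address"] = "Berlin"
print(df)
```

Unlike the lambda version, this does not raise on non-string cells, but it modifies df in place.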
Upvotes: 1
Reputation: 130
Suppose the data has multiple strings in a column named Array. We can do the same thing with re.sub, applied to the string representation of the DataFrame, so the result is a plain string rather than a DataFrame.
import pandas as pd
import re
d = {'Array': ['Platz der Republik 1, 10557 Berlin, germany', 'Platz der Republik 1, 10557 Berlin, germany', 'Platz der Republik 1, 10557 Berlin, germany']}
# Create the DataFrame
dd = pd.DataFrame(d)
print(dd)
# str(dd) is the printed text of the DataFrame, so result is a str, not a DataFrame
result = re.sub(r'[Gg]ermany', 'Berlin', str(dd))
print(result)
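If you would rather keep working with pandas objects than with a printed string, the same substitution can probably be done per cell with Series.str.replace. A minimal sketch, reusing the column name Array from above:

```python
import pandas as pd

dd = pd.DataFrame({"Array": ["Platz der Republik 1, 10557 Berlin, germany"] * 3})

# Replace only the matched country name inside each cell.
# The result is a Series, so it can be assigned back to a column.
replaced = dd["Array"].str.replace(r"[Gg]ermany", "Berlin", regex=True)
print(replaced.tolist())
```

As with the re.sub version, this replaces only the matched word, not the entire string.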
Upvotes: 0
Reputation: 130
Note that if d is a plain Python string, result = d.replace(r'[Gg]ermany', 'Berlin', regex=True) does not work. The built-in str.replace() method has no regex keyword argument, so the call raises a TypeError, and it does not interpret regular expressions anyway.
For a plain string, import re and use re.sub instead:
import re
d = 'Platz der Republik 1, 10557 Berlin, germany'
print(d)
# Replace the matched country name in the plain string
result = re.sub(r'[Gg]ermany', 'Berlin', d)
print(result)
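The snippet above replaces only the country name. To get just 'Berlin' as asked in the question, the pattern has to cover the whole string. A small sketch, assuming a single-line string:

```python
import re

d = "Platz der Republik 1, 10557 Berlin, germany"

# ".*" on both sides makes the match span the whole line, so the
# replacement overwrites everything, not just the country name.
# re.IGNORECASE accepts "Germany", "GERMANY", and so on.
result = re.sub(r".*germany.*", "Berlin", d, flags=re.IGNORECASE)
print(result)
```

Because "." does not match newlines by default, a multi-line string would only have the matching line replaced.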
Hope this helps.
Upvotes: 0