Reputation: 1

Changing the DataFrame using a regular expression

I just started studying pandas and have a question: I need to use a regular expression to find strings that match a certain pattern and replace the entire string with a fixed value. I used:

df.replace(to_replace=r'[Gg]ermany', value='Berlin', regex=True)

Source string: 'Platz der Republik 1, 10557 Berlin, germany'

My code outputs: 'Platz der Republik 1, 10557 Berlin, Berlin'

But I need the whole string to become: 'Berlin'

Upvotes: 0

Answers (3)

Boris Silantev

Reputation: 763

You should fix the regular expression so that it also matches whatever comes before and after the country name. Then the match covers the entire string, and the entire string is replaced. Try this:

df.replace(to_replace=r'.*[Gg]ermany.*', value='Berlin', regex=True)
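To see why the surrounding .* matters, here is a minimal sketch comparing the two patterns. The column name "address" and the second row are made up for illustration; they are not from the question.

```python
import pandas as pd

# Hypothetical sample data: one German address and one row that must stay untouched.
df = pd.DataFrame({
    "address": [
        "Platz der Republik 1, 10557 Berlin, germany",
        "Paris, France",
    ]
})

# Without .* only the matched substring is swapped out.
partial = df.replace(to_replace=r"[Gg]ermany", value="Berlin", regex=True)

# With .* on both sides the match spans the whole cell, so the whole cell is replaced.
whole = df.replace(to_replace=r".*[Gg]ermany.*", value="Berlin", regex=True)

print(partial["address"].tolist())
print(whole["address"].tolist())
```

Rows that do not contain the country name are left as they are in both cases.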

I would also consider using the IGNORECASE flag so that any capitalization of the country name is detected:

df.replace(to_replace=r'.*(?i:germany).*', value='Berlin', regex=True)
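A small sketch of the case-insensitive variant, assuming a Series with made-up values. It shows the inline (?i:...) group from above next to Series.str.replace with flags=re.IGNORECASE, which should give the same result here.

```python
import re

import pandas as pd

# Hypothetical sample values with different capitalizations.
s = pd.Series(["Berlin, GERMANY", "Berlin, Germany", "Vienna, Austria"])

# Inline flag: only the group (?i:...) is matched case-insensitively.
inline = s.replace(to_replace=r".*(?i:germany).*", value="Berlin", regex=True)

# Same idea with an explicit flag passed to the string accessor.
flagged = s.str.replace(r".*germany.*", "Berlin", flags=re.IGNORECASE, regex=True)

print(inline.tolist())
print(flagged.tolist())
```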

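An alternative approach is to use re.search() (this needs import re; in pandas 2.1 and later, DataFrame.applymap is deprecated in favor of DataFrame.map). The isinstance check keeps non-string cells from raising a TypeError:

df.applymap(lambda x: 'Berlin' if isinstance(x, str) and re.search(r'[Gg]ermany', x) else x)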
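If only one column needs the check, a boolean mask may be simpler than an element-wise lambda. This is a sketch under assumptions: the column name "address" and the extra rows (including a missing value) are invented for the example.

```python
import pandas as pd

# Hypothetical sample data, including a missing value.
df = pd.DataFrame({
    "address": [
        "Platz der Republik 1, 10557 Berlin, germany",
        "Paris, France",
        None,
    ]
})

# True where the cell contains the pattern; na=False treats missing cells as no match.
mask = df["address"].str.contains(r"[Gg]ermany", regex=True, na=False)

# Overwrite only the matching cells with the fixed value.
df.loc[mask, "address"] = "Berlin"

print(df)
```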

Upvotes: 1

Shah

Reputation: 130

Suppose the new data holds several strings in a column named 'Array'. We can do exactly the same substitution, but here it is applied to the string form of the DataFrame:

import pandas as pd
import re

d = {'Array': ['Platz der Republik 1, 10557 Berlin, germany',
               'Platz der Republik 1, 10557 Berlin, germany',
               'Platz der Republik 1, 10557 Berlin, germany']}

# Create the DataFrame
dd = pd.DataFrame(d)
print(dd)

# str(dd) is the printed text of the DataFrame, so result is a plain str, not a DataFrame
result = re.sub(r'[Gg]ermany', 'Berlin', str(dd))
print(result)
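If the result should stay a DataFrame rather than a piece of text, one option is to run the substitution on the column itself. A minimal sketch, reusing the 'Array' column from above and swapping in Series.str.replace for re.sub:

```python
import pandas as pd

dd = pd.DataFrame({
    "Array": ["Platz der Republik 1, 10557 Berlin, germany"] * 3
})

# Apply the regular expression to every string in the column; dd stays a DataFrame.
dd["Array"] = dd["Array"].str.replace(r"[Gg]ermany", "Berlin", regex=True)

print(type(dd).__name__)
print(dd["Array"].tolist())
```

As with re.sub above, this replaces only the matched word, not the whole string.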

Upvotes: 0

Shah

Reputation: 130

Consider result = d.replace(r'[Gg]ermany', 'Berlin', regex=True), where d is a plain Python string. This call cannot use a regular expression: it raises a TypeError because of the regex keyword argument, which the built-in str.replace() does not accept.

str.replace() only substitutes literal text, so for a regular expression we need to import re and use re.sub():
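The failure described above can be reproduced with a short sketch. The exact error message differs between Python versions, so only the exception type is shown.

```python
import re

d = 'Platz der Republik 1, 10557 Berlin, germany'

# str.replace() has no regex parameter, so this call raises a TypeError.
try:
    d.replace(r'[Gg]ermany', 'Berlin', regex=True)
    raised = False
except TypeError as exc:
    raised = True
    print(type(exc).__name__)

# Without the keyword the call runs, but the pattern is treated as literal text,
# and the literal text "[Gg]ermany" does not occur in d, so nothing changes.
literal = d.replace(r'[Gg]ermany', 'Berlin')
print(literal == d)

# re.sub() interprets the pattern as a regular expression.
fixed = re.sub(r'[Gg]ermany', 'Berlin', d)
print(fixed)
```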

import re

d = 'Platz der Republik 1, 10557 Berlin, germany'
print(d)

result = re.sub(r'[Gg]ermany','Berlin',d)
print(result)

Output:

Platz der Republik 1, 10557 Berlin, germany
Platz der Republik 1, 10557 Berlin, Berlin

Hope this helps.

Upvotes: 0
