Reputation: 61
I have a list like this:
x = ['Las Vegas', 'San Francisco', 'Dallas']
And a dataframe that looks a bit like this:
import pandas as pd
data = [['Las Vegas (Clark County)', 25], ['New York', 23],
        ['Dallas', 27]]
df = pd.DataFrame(data, columns=['City', 'Value'])
I want to replace city values in the DataFrame such as "Las Vegas (Clark County)" with "Las Vegas". My DataFrame contains multiple cities whose names need to be changed in this way. I know I could use a regular expression to strip off the part in parentheses, but I was wondering whether there is a cleverer, more generic way.
Upvotes: 1
Views: 82
Reputation: 863501
Use Series.str.extract with the list values joined by | (regex OR), then replace the non-matched values with the originals using Series.fillna:
df['City'] = df['City'].str.extract(f'({"|".join(x)})', expand=False).fillna(df['City'])
print(df)
        City  Value
0  Las Vegas     25
1   New York     23
2     Dallas     27
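One caveat with a joined pattern: if a name in the list contains regex metacharacters (for example "." or "("), it will not be matched literally. A minimal sketch, assuming the same x and df as in the question, escapes each name with re.escape first. Sorting the names longest-first is an extra precaution of mine, not part of the approach above, so that a longer name wins over a shorter one that is a prefix of it.

```python
import re
import pandas as pd

x = ['Las Vegas', 'San Francisco', 'Dallas']
df = pd.DataFrame(
    [['Las Vegas (Clark County)', 25], ['New York', 23], ['Dallas', 27]],
    columns=['City', 'Value'],
)

# Escape each name so regex metacharacters are matched literally, and
# put longer names first so they take priority in the alternation.
pattern = '|'.join(re.escape(name) for name in sorted(x, key=len, reverse=True))

# Extract the first listed name found in each row; rows without a match
# become NaN and are filled back in from the original column.
df['City'] = df['City'].str.extract(f'({pattern})', expand=False).fillna(df['City'])
```

With the sample data, this should give the same result as the unescaped version, since none of the three names contains special characters.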
Another idea is to use Series.str.contains in a loop, but this will be slow for a large DataFrame and many values in the list:
for val in x:
    df.loc[df['City'].str.contains(val), 'City'] = val
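Note that Series.str.contains treats its argument as a regular expression by default. If the names should be matched as plain substrings, a possible variant (a sketch, assuming x and df are as in the question) passes regex=False, and na=False so that missing city values count as non-matches:

```python
import pandas as pd

x = ['Las Vegas', 'San Francisco', 'Dallas']
df = pd.DataFrame(
    [['Las Vegas (Clark County)', 25], ['New York', 23], ['Dallas', 27]],
    columns=['City', 'Value'],
)

for val in x:
    # Literal substring test; rows with a missing City are treated as False.
    mask = df['City'].str.contains(val, regex=False, na=False)
    # Overwrite every matching row with the clean name from the list.
    df.loc[mask, 'City'] = val
```

This still makes one pass over the column per list entry, so the performance concern remains.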
Upvotes: 2