Reputation: 419
I'm trying to change the values in a single column using pandas apply(). My function partially worked, but I'm stuck on how to fix this other half.
Data Column:
County Name
Riverside County
San Diego County
SanFrancisco County/city
I'm trying to get rid of " County" so I'm left with just the name of the county. I successfully got rid of the " County" with the function but I'm having trouble removing the " County/city" from San Francisco.
Code:
def modify_county(countyname):
if "/city" in countyname:
return countyname.replace(" County/city","")
return countyname.replace(" County","")
lfd["CountyName"] = lfd["CountyName"].apply(modify_county)
Output:
CountyName
Riverside
San Diego
San Francisco County/city
Is something wrong with the conditional in the function?
Upvotes: 0
Views: 54
Reputation: 419
@jpp, used suggestion to apply to entire column. Not sure if this is the best way, but it worked.
lfd["CountyName"] = pd.Series(lfd["CountyName"])
lfd["CountyName"] = lfd["CountyName"].apply(lambda x: ' '.join([w for w in x.split() if not 'County' in w]))
Upvotes: 0
Reputation: 1035
@jpp's answer is the literal way of doing what you asked for. But in this case I would use pandas.Series.replace
and some regex to substitute the entire thing in one go:
import pandas as pd
s = pd.Series(['Riverside County', 'San Diego County', 'SanFrancisco County/city'])
res = s.replace(' County(/city)?', '', regex=True)
Upvotes: 1
Reputation: 164663
This is an alternative way. It works with the data you have provided.
import pandas as pd
s = pd.Series(['Riverside County', 'San Diego County', 'SanFrancisco County/city'])
res = s.apply(lambda x: ' '.join([w for w in x.split() if not 'County' in w]))
print(res)
# 0 Riverside
# 1 San Diego
# 2 SanFrancisco
# dtype: object
Upvotes: 1