Reputation: 398
I want to do something like this:
for row in df:
    if row['Country'] == 'unknown':
        row['Country'] = city2country_mapping[row['city']]
Country and City are columns.
'city2country_mapping' is a dictionary whose key:value pairs are 'city':'country'.
(Basically, I'm trying to fill in the unknowns by looking up the country in the dictionary, since I know the city for each row.)
Upvotes: 2
Views: 2351
Reputation: 8458
Editing specific rows: DataFrame.loc vs. Series.where
The standard option for editing specific rows (a "slice") of a DataFrame object is .loc.
The accepted answer uses a neat application of pandas.Series.where to rewrite the df.Country Series, which is more succinct for this specific task.
Recoding values: .apply vs. .map
You can use .map() to recode a Series directly with a dictionary - no need to .apply() a lambda function.
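As a rough sketch of that difference: the snippet below recodes the same Series both ways. The `cities` Series and the unmapped 'Oslo' entry are made up for illustration and are not part of the question's data. With a dictionary, .map() leaves a missing value (NaN) for keys that are not in the dictionary, while a lambda that indexes the dictionary directly would raise a KeyError, so the .apply() version here uses dict.get instead.

```python
import pandas as pd

# Hypothetical data: 'Oslo' is deliberately absent from the mapping
cities = pd.Series(['London', 'New York', 'Oslo'])
city2country_mapping = {'London': 'UK', 'New York': 'USA'}

# .map() with a dict: unmapped keys become missing values (NaN)
mapped = cities.map(city2country_mapping)

# .apply() with a lambda: dict.get avoids a KeyError for 'Oslo'
applied = cities.apply(lambda c: city2country_mapping.get(c))

print(mapped.isna().tolist())
```

Both calls agree on the cities that are in the dictionary; they only differ in how much code you write and in how the missing key is handled.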
Example
# Example data
import pandas as pd

df = pd.DataFrame({'Country': ['unknown', 'USA', 'unknown', 'UK', 'USA', 'unknown'],
                   'City': ['London', 'New York', 'New York', 'London', 'New York', 'Paris']
                   })
city2country_mapping = {'London': 'UK', 'New York': 'USA', 'Paris': 'France'}
# print(df)
#    Country      City
# 0  unknown    London
# 1      USA  New York
# 2  unknown  New York
# 3       UK    London
# 4      USA  New York
# 5  unknown     Paris
df.loc[df.Country == 'unknown', 'Country'] = df[df.Country == 'unknown'].City.map(city2country_mapping)
print(df)
Output:
   Country      City
0       UK    London
1      USA  New York
2      USA  New York
3       UK    London
4      USA  New York
5   France     Paris
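A slightly tidier variant of the same .loc approach, as a sketch: compute the boolean mask once and reuse it on both sides of the assignment. The variable name `unknown` is my own choice, and the data is a shortened copy of the example above.

```python
import pandas as pd

df = pd.DataFrame({'Country': ['unknown', 'USA', 'unknown'],
                   'City': ['London', 'New York', 'Paris']})
city2country_mapping = {'London': 'UK', 'New York': 'USA', 'Paris': 'France'}

# Boolean mask marking the rows that need a country filled in
unknown = df['Country'] == 'unknown'

# Look up the country only for those rows and write it back in place
df.loc[unknown, 'Country'] = df.loc[unknown, 'City'].map(city2country_mapping)

print(df)
```

Naming the mask avoids evaluating the comparison twice and makes it clear that both sides select the same rows.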
Upvotes: 1
Reputation: 2533
You can do this using apply:
df['Country'] = df.apply(lambda row: city2country_mapping[row['city']]
                         if row['Country'] == 'unknown' else row['Country'], axis=1)
The lambda returns the country looked up from the mapping when the row's country is 'unknown', and otherwise just the country already in that row.
Upvotes: 2
Reputation: 215047
You can vectorize this with pandas.Series.where:
df['country'] = df.country.where(
    df.country != 'unknown', df.city.map(city2country_mapping))
df.city.map(city2country_mapping) will first create a Series that contains the corresponding country for each city, and then use this to replace the unknown countries in the country column.
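One thing to watch for, shown here as a minimal sketch: if a city is missing from the dictionary, .map() yields NaN for it, and .where would then write that NaN into the country column. The example below assumes you would rather keep 'unknown' in that case, so it falls back to the original value with fillna. The data, including the unmapped city 'Atlantis', is invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'country': ['unknown', 'USA', 'unknown'],
    'city': ['London', 'New York', 'Atlantis'],  # 'Atlantis' has no mapping (made up)
})
city2country_mapping = {'London': 'UK', 'New York': 'USA'}

# Country for each city; NaN where the city is not in the dictionary
looked_up = df['city'].map(city2country_mapping)

# Keep known countries; elsewhere use the lookup, or the old value if the lookup failed
df['country'] = df['country'].where(
    df['country'] != 'unknown', looked_up.fillna(df['country']))

print(df['country'].tolist())
```

Series.where keeps values where the condition is True and replaces the rest, which is why the condition is written as `!= 'unknown'`.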
Upvotes: 2