Reputation: 25
df1
| location | latitude
0 | NaN | 41.0
1 | NaN | 43.0
2 | NaN | 72.0
3 | NaN | 74.0
df2
| location | min_latitude | max_latitude
0 | point_a | 40.0 | 45.0
1 | point_b | 70.0 | 75.0
How can I update the location
values in df1
based on the condition that latitude
is between min_latitude
and max_latitude
just like the result below?
df1
| location | latitude
0 | point_a | 41.0
1 | point_a | 43.0
2 | point_b | 72.0
3 | point_b | 74.0
I have tried using df1['location'].apply(lambda x: df2['location'] if df1['location'].between(df2['min_latitude'], df2['max_latitude']) else df1['location'])
, but it returns ValueError: Can only compare identically-labeled Series objects
.
Upvotes: 0
Views: 43
Reputation: 340
You don't need to use between, you can use boolean indexing and get what you want with:
for idx in df2.index:
df1.loc[(df1["latitude"] > df2["min_latitude"][idx]) &
(df1["latitude"] < df2["max_latitude"][idx]), "location"] = df2["location"][idx]
It's a complicated looking one liner, but it's just a couple of comparisons to get the rows you want. Using loc here is important, otherwise pandas won't assign anything due to chaining
Upvotes: 1