Reputation: 5774
Below is the DataFrames example I want to merge.
#!/usr/bin/env python
import pandas as pd
countries = ['Germany', 'France', 'Indonesia']
rank_one = [1, 5, 7]
capitals = ['Berlin', 'Paris', 'Jakarta']
df1 = pd.DataFrame({'country': countries,
'rank_one': rank_one,
'capital': capitals})
df1 = df1[['country', 'capital', 'rank_one']]
population = ['8M', '82M', '66M', '255M']
rank_two = [0, 1, 6, 9]
df2 = pd.DataFrame({'population': population,
'rank_two': rank_two})
df2 = df2[['rank_two', 'population']]
I want to merge the both DataFrames based on an exact or approximate match.
if rank_two is equal to rank_one
OR
rank_two is the closest and next bigger number from rank_one
.
Example :
df1 :
country capital rank_one
0 Germany Berlin 1
1 France Paris 5
2 Indonesia Jakarta 7
df2 :
rank_two population
0 0 8M
1 1 82M
2 6 66M
3 9 255M
df3_result :
country capital rank_one rank_two population
0 Germany Berlin 1 1 82M
1 France Paris 5 6 66M
2 Indonesia Jakarta 7 9 255M
Upvotes: 3
Views: 1598
Reputation: 2939
You can use the pandas 'merge_asof' function
pd.merge_asof(df1, df2, left_on="rank_one", right_on="rank_two", direction='forward')
alternatively if you want to merge by closest and you don't mind if it's higher or lower you can use:
direction="nearest"
Upvotes: 2
Reputation: 323366
By using merge_asof
pd.merge_asof(df1,df2,left_on='rank_one',right_on='rank_two',direction='forward')
Out[1206]:
country capital rank_one rank_two population
0 Germany Berlin 1 1 82M
1 France Paris 5 6 66M
2 Indonesia Jakarta 7 9 255M
Upvotes: 6