Reputation: 15
I am trying to merge two datasets (DataFrames), defined as follows:
import pandas as pd
D1 = pd.DataFrame({'Village':['Ampil','Ampil','Ampil','Bachey','Bachey','Center','Center','Center','Center'], 'Code':[123,324,190,453,321,786,456,234,987]})
D2 = pd.DataFrame({'Village':['Ampil','Ampil','Bachey','Bachey','Center','Center'],'Lat':[11.563,13.278,12.637,11.356,12.736,13.456], 'Long':[102.234,103.432,105.673,103.539,103.873,102.983]})
I want to merge the two based on the Village column. I want the output to look like the following:
D3 = pd.DataFrame({'Village': ['Ampil','Ampil','Bachey','Bachey','Center','Center'],'Code':[123,324,453,321,786,456],'Lat':[11.563,13.278,12.637,11.356,12.736,13.456], 'Long':[102.234,103.432,105.673,103.539,103.873,102.983]})
I have tried join, merge, and concat, but none of them gives this result. I need code that will also work on a larger dataset. I would really appreciate it if someone could help.
Upvotes: 1
Views: 58
Reputation: 22503
One way is to first create a running cumcount by Village for both of your initial DataFrames, and then merge on both Village and count:
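# Number the rows within each Village group: 0, 1, 2, ...
D1['count'] = D1.groupby('Village').cumcount()
D2['count'] = D2.groupby('Village').cumcount()
# Pair the n-th row of each Village in D2 with the n-th row of that Village in D1
print(D2.merge(D1, on=['Village', 'count'], how='left').drop('count', axis=1))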
# Output:
  Village     Lat     Long  Code
0   Ampil  11.563  102.234   123
1   Ampil  13.278  103.432   324
2  Bachey  12.637  105.673   453
3  Bachey  11.356  103.539   321
4  Center  12.736  103.873   786
5  Center  13.456  102.983   456
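If you need this on larger data, the same idea can be wrapped in a small function. The following is only a sketch: the helper name merge_by_occurrence and the temporary column _pos are made up for illustration and are not part of pandas. It assumes that rows should be paired purely by their order of appearance within each Village. It uses an inner merge with D1 on the left, so the columns come out in the order asked for in the question (Village, Code, Lat, Long), and rows without a partner in the other frame are dropped. It also leaves D1 and D2 unmodified.

```python
import pandas as pd

D1 = pd.DataFrame({
    'Village': ['Ampil', 'Ampil', 'Ampil', 'Bachey', 'Bachey',
                'Center', 'Center', 'Center', 'Center'],
    'Code': [123, 324, 190, 453, 321, 786, 456, 234, 987],
})
D2 = pd.DataFrame({
    'Village': ['Ampil', 'Ampil', 'Bachey', 'Bachey', 'Center', 'Center'],
    'Lat': [11.563, 13.278, 12.637, 11.356, 12.736, 13.456],
    'Long': [102.234, 103.432, 105.673, 103.539, 103.873, 102.983],
})


def merge_by_occurrence(left, right, key):
    """Pair the n-th row of each `key` group in `left` with the n-th row
    of the same group in `right` (hypothetical helper, not a pandas API)."""
    # assign() returns copies, so the caller's frames are not modified.
    # '_pos' is an assumed temporary column name; pick another one if
    # your data already has a column called '_pos'.
    left_pos = left.assign(_pos=left.groupby(key).cumcount())
    right_pos = right.assign(_pos=right.groupby(key).cumcount())
    # Inner merge: keep only (key, position) pairs present in both frames.
    merged = left_pos.merge(right_pos, on=[key, '_pos'], how='inner')
    return merged.drop(columns='_pos')


D3 = merge_by_occurrence(D1, D2, 'Village')
print(D3)
```

Switching how='inner' to how='left' or how='outer' would keep the unmatched rows instead, with NaN in the missing columns.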
Upvotes: 1