Reputation: 573
I have these two dataframes:
DF1=
Inflow
0 9810998109
1 5591255912
2 7394273942
3 7866678666
4 1820118202
5 9812198109
6 9810998101
7 4304043040
8 9810998121
DF2=
Inflow mi_to_zcta5
0 3371433756 11.469054
1 1790118201 24.882142
I'd like to do an operation where I can merge these two dataframes based in the 'Inflow' column. Sort of like trying to recreate a VLookUp Excel function with approximate matching (like its shown in this question). But I get a failure each time. The line I've been trying to use for this purpose is this one:
test = pd.merge_asof(DF1, DF2, on = 'mi_to_zcta5')
I've tried using the additional settings like 'allow_exact_matches' set to 'False' but I've not succeeded.
This is the error I get:
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5280)
File "pandas\_libs\index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5126)
File "pandas\_libs\hashtable_class_helper.pxi", line 1210, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20523)
File "pandas\_libs\hashtable_class_helper.pxi", line 1218, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20477)
KeyError: 'mi_to_zcta5'
I'd like to get a dataframe with 10 rows, with the "Inflow" column and an additional column of 'mi_to_zcta5' with the correspondant nearest value (if that's possible). Just like one would do in VLookUp in excel using approximate match.
Thanks in advance!
Upvotes: 0
Views: 3923
Reputation: 8816
This is your solution:
In your first dataFrame(df1) you have only column while Second(df2) has two, while doing pd.merge
you have to choose outer
, which is a union of the keys. This means all of the indexes are shown and where it has missing cols it keeps them as NaN.
>>> df1
Inflow
0 9810998109
1 5591255912
2 7394273942
3 7866678666
4 1820118202
5 9812198109
6 9810998101
7 4304043040
8 9810998121
>>> df2
Inflow mi_to_zcta5
0 3371433756 11.469054
1 1790118201 24.882142
>>>
>>>
>>>
>>> pd.merge( df1, df2, on=['Inflow'], how='outer')
Inflow mi_to_zcta5
0 9810998109 NaN
1 5591255912 NaN
2 7394273942 NaN
3 7866678666 NaN
4 1820118202 NaN
5 9812198109 NaN
6 9810998101 NaN
7 4304043040 NaN
8 9810998121 NaN
9 3371433756 11.469054
10 1790118201 24.882142
Note: you can not merge on the Key 'mi_to_zcta5
as this is not present on df
Upvotes: 2