Reputation: 419
I have two dataframes
df1
KO-ST 1_UID 2_Vloge
0 1976-_ 200106897 200106897.0
1 991-_ 200108737 200108737.0
2 2147--- 200109776 200109776.0
3 2048-_ 200300912 200300912.0
4 2194-_ 200301057 200301057.0
5 2386--- 200301312 200301312.0
6 2002-_ 200301315 200301315.0
7 1324-_ 200301573 200301573.0
8 1625-45 200301868 200301868.0
9 1625-_ 200301868 200301868.0
...
df2
a b
SID KO-ST
10000002 851-601 288.0 288.0
10000003 851-1 68.0 68.0
10000328 853-103 64.5 64.5
10000583 861-25 30.1 30.1
10001002 2590-1 96.7 178.9
10001004 2593-2 349.2 349.2
10001005 2593-3 282.0 295.2
10001006 2593-4 121.5 121.5
10001008 2593-6 109.3 110.3
10001009 2593-7 9.9 9.9
...
There are more than 500,000 rows, where KO-ST is unique and SID can be repeated. I am trying to group them and repeat the values from columns a and b. The KO-ST values are unique, but in about 10% of cases they are not well-formed, and for these cases (e.g. 1324-___) there will be no match in df2.
My initial code is
REN_ES = pd.merge(df1, df2, left_index=True, on = 'KO-ST')
But I get an error:
KeyError: 'KO-ST'
Where did I go wrong? df1 is the result of importing two CSV files, then combining and merging some values. To make the data easier to work with, the column KO-ST was added as a combination of two columns:
DS_STA['KO-ST'] = DS_STA['KO_SIFKO'].map(str) + "-" + DS_STA['STEV'].map(str)
KO_SIFKO and STEV are integers, hence the conversion to str. I mention this because I suspect something is wrong with how the data types are recognized.
Upvotes: 0
Views: 108
Reputation: 2298
df2 has a MultiIndex, so "KO-ST" is an index level, not a column name. left_index=True overrides on='KO-ST', so remove it. Try:
REN_ES = pd.merge(df1, df2.reset_index(), on = 'KO-ST')
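To see why this works, here is a minimal sketch on made-up miniature frames (the values and the three-row df1 are hypothetical, only the column and index names follow the question). It checks that "KO-ST" is an index level of df2 and then merges after reset_index(). The how="left" and indicator=True arguments are my additions, not part of the line above: they are one possible way to keep the df1 rows that have no match in df2 and to flag them, if that is what you want.

```python
import pandas as pd

# Hypothetical miniature versions of the frames in the question.
df1 = pd.DataFrame({
    "KO-ST": ["1976-_", "2590-1", "2593-2"],
    "1_UID": [200106897, 200108737, 200109776],
})
df2 = pd.DataFrame(
    {"a": [96.7, 349.2], "b": [178.9, 349.2]},
    index=pd.MultiIndex.from_tuples(
        [(10001002, "2590-1"), (10001004, "2593-2")],
        names=["SID", "KO-ST"],
    ),
)

# "KO-ST" is an index level of df2, not one of its columns.
print(list(df2.columns))      # ['a', 'b']
print(list(df2.index.names))  # ['SID', 'KO-ST']

# reset_index() turns the index levels into ordinary columns.
right = df2.reset_index()

# how="left" keeps every df1 row, matched or not (assumption: you want this).
# indicator=True adds a "_merge" column saying where each row came from.
merged = pd.merge(df1, right, on="KO-ST", how="left", indicator=True)

# Rows of df1 that found no partner in df2 (e.g. the malformed keys).
unmatched = merged[merged["_merge"] == "left_only"]
print(unmatched["KO-ST"].tolist())
```

With the default how="inner", as in the one-liner above, the unmatched rows are simply dropped from the result.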
Upvotes: 2