Reputation: 3710
I have issues with the merging of two large Dataframes since the merge returns NaN values though there are fitting values. The two dfs are shaped like:
df1
Motor
2232
1524
2230
2230
2224
1516
1724
2224
1524
1624
1724
2224
2224
1524
1524
1516
1524
2224
1624
1724
1724
2224
2224
df2
Motor Output Torque (mNm)
0615 0,17
1219 0,72
1516 0,59
1624 2
2230 4,7
2233 5,9
0816 0,7
1016 0,92
1024 1,6
1224 1,7
1319 1,4
1331 3,8
1516 0,97
1524 2,9
1717 2,2
1724 4,5
2224 6,8
2232 10
1336 3,6
1727 4,9
1741 8,8
2237 12
2642 26
I use the code:
MergeDat=MergeDat.merge(Motor,how="left")
print(MergeDat)
where MergeDat= df1 and Motor= df2
As result it returns:
Motor Output Torque (mNm)
0 2232 NaN
1 1524 NaN
2 2230 NaN
3 2230 NaN
4 2224 NaN
5 1516 NaN
6 1724 NaN
7 2224 NaN
8 1524 NaN
9 1624 NaN
10 1724 NaN
11 2224 NaN
12 2224 NaN
13 1524 NaN
14 1524 NaN
15 1516 NaN
16 1524 NaN
17 2224 NaN
18 1624 NaN
19 1724 NaN
20 1724 NaN
21 2224 NaN
22 2224 NaN
23 1524 NaN
24 1724 NaN
25 1841 NaN
26 2224 NaN
I have no idea why the Output Torque column is not merged...
Appreciate any help!
Upvotes: 31
Views: 85980
Reputation: 1352
For me I thought I had the same index but but there was a trailing space in the index for the first dataframe. Removing/trimming that helped.
Upvotes: 1
Reputation: 6761
Having some NaN
's in the key column(s) is the usual culprit from my experience. Try at least the 2nd of these 3 lines on both df
's (where unique_id
is the key column used for merging) and see if it helps:
print(df[unique_id].duplicated().sum())
df.drop_duplicates(subset=unique_id, inplace=True)
assert(df[unique_id].duplicated().sum() == 0)
Upvotes: 0
Reputation: 3112
In my case, it was because I haven't reset the index after splitting the data frame, using df.reset_index(drop=True)
. Resetting the index of the first data frame enabled merging a second data frame to it.
Upvotes: 3
Reputation: 862581
You need same dtype
of joined columns:
#convert first or second to str or int
MergeDat['Motor'] = MergeDat['Motor'].astype(str)
#Motor['Motor'] = Motor['Motor'].astype(str)
#MergeDat['Motor'] = MergeDat['Motor'].astype(int)
Motor['Motor'] = Motor['Motor'].astype(int)
#convert first or second to str or int
#MergeDat['Motor'] = MergeDat['Motor'].astype(str)
Motor['Motor'] = Motor['Motor'].astype(str)
MergeDat['Motor'] = MergeDat['Motor'].astype(int)
#Motor['Motor'] = Motor['Motor'].astype(int)
MergeDat=MergeDat.merge(Motor,how="left")
Upvotes: 48