2Obe
2Obe

Reputation: 3710

Pandas Merge returns NaN

I have issues with the merging of two large Dataframes since the merge returns NaN values though there are fitting values. The two dfs are shaped like:

df1

Motor
2232
1524
2230
2230
2224
1516
1724
2224
1524
1624
1724
2224
2224
1524
1524
1516
1524
2224
1624
1724
1724
2224
2224

df2

Motor   Output Torque (mNm)
0615    0,17
1219    0,72
1516    0,59
1624    2
2230    4,7
2233    5,9
0816    0,7
1016    0,92
1024    1,6
1224    1,7
1319    1,4
1331    3,8
1516    0,97
1524    2,9
1717    2,2
1724    4,5
2224    6,8
2232    10
1336    3,6
1727    4,9
1741    8,8
2237    12
2642    26

I use the code:

MergeDat=MergeDat.merge(Motor,how="left")
print(MergeDat)

where MergeDat= df1 and Motor= df2

As result it returns:

  Motor  Output Torque (mNm)
0      2232                  NaN
1      1524                  NaN
2      2230                  NaN
3      2230                  NaN
4      2224                  NaN
5      1516                  NaN
6      1724                  NaN
7      2224                  NaN
8      1524                  NaN
9      1624                  NaN
10     1724                  NaN
11     2224                  NaN
12     2224                  NaN
13     1524                  NaN
14     1524                  NaN
15     1516                  NaN
16     1524                  NaN
17     2224                  NaN
18     1624                  NaN
19     1724                  NaN
20     1724                  NaN
21     2224                  NaN
22     2224                  NaN
23     1524                  NaN
24     1724                  NaN
25     1841                  NaN
26     2224                  NaN

I have no idea why the Output Torque column is not merged...

Appreciate any help!

Upvotes: 31

Views: 85980

Answers (4)

Erik Thysell
Erik Thysell

Reputation: 1352

For me I thought I had the same index but but there was a trailing space in the index for the first dataframe. Removing/trimming that helped.

Upvotes: 1

mirekphd
mirekphd

Reputation: 6761

Having some NaN's in the key column(s) is the usual culprit from my experience. Try at least the 2nd of these 3 lines on both df's (where unique_id is the key column used for merging) and see if it helps:

print(df[unique_id].duplicated().sum())
df.drop_duplicates(subset=unique_id, inplace=True)
assert(df[unique_id].duplicated().sum() == 0)

Upvotes: 0

Pavindu
Pavindu

Reputation: 3112

In my case, it was because I haven't reset the index after splitting the data frame, using df.reset_index(drop=True). Resetting the index of the first data frame enabled merging a second data frame to it.

Upvotes: 3

jezrael
jezrael

Reputation: 862581

You need same dtype of joined columns:

#convert first or second to str or int
MergeDat['Motor'] = MergeDat['Motor'].astype(str)
#Motor['Motor'] = Motor['Motor'].astype(str)

#MergeDat['Motor'] = MergeDat['Motor'].astype(int)
Motor['Motor'] = Motor['Motor'].astype(int)

#convert first or second to str or int
#MergeDat['Motor'] = MergeDat['Motor'].astype(str)
Motor['Motor'] = Motor['Motor'].astype(str)

MergeDat['Motor'] = MergeDat['Motor'].astype(int)
#Motor['Motor'] = Motor['Motor'].astype(int)


MergeDat=MergeDat.merge(Motor,how="left")

Upvotes: 48

Related Questions