Newbielp
Newbielp

Reputation: 532

Merging two dataframes on the same type column gives me wrong result

I have two dataframes, assume A and B, which have been created after reading the sheets of an Excel file and performing some basic functions. I need to merge right the two dataframes on a column named ID which has first been converted to astype(str) for both dataframes.

The ID column of the left Dataframe (A) is:

0        5815518813016
1        5835503994014
2        5835504934023
3        5845535359006
4        5865520960012
5        5865532845006
6        5875531550008
7        5885498289039
8     5885498289039_A2
9     5885498289039_A3
10    5885498289039_X2
11    5885498289039_X3
12       5885509768698
13       5885522349999
14       5895507791025
Name: ID, dtype: object

The ID column of the right Dataframe (B) is:

0        5835503994014
1        5845535359006
2        5835504934023
3        5815518813016
4     5885498289039_A1
5     5885498289039_A2
6     5885498289039_A3
7     5885498289039_X1
8     5885498289039_X2
9     5885498289039_X3
10       5885498289039
11       5865532845006
12       5875531550008
13       5865520960012
14       5885522349998
15       5895507791025
16       5885509768698
Name: ID, dtype: object

However, when I merge the two, the rest of the columns of the left (A) dataframe become "empty" (np.nan) except for the rows where the ID does not contain only numbers but letters too. This is the pd.merge() I do:

A_B=A.merge(B[['ID','col_B']], left_on='ID', right_on='ID', how='right')

Do you have any ideas what might be so wrong? Your input is valuable.

Upvotes: 0

Views: 473

Answers (1)

Hadas Arik
Hadas Arik

Reputation: 58

Try turning all values in both columns into strings: A['ID'] = A['ID'].astype(str) B['ID'] = B['ID'].astype(str)

Generally, when a merge like this doesn't work, I would try to debug by printing out the unique values in each column to check if anything pops out (usually dtype issues).

Upvotes: 1

Related Questions