Georg Heiler

Reputation: 17694

pandas left join - why more results?

How is it possible that a pandas left join like

df.merge(df2, left_on='first', right_on='second', how='left')

increases the data frame from 221309 to 1388680 rows?

Edit:

shape of df: (221309, 83)

shape of df2: (7602, 6)

Upvotes: 3

Views: 4566

Answers (1)

MaxU - stand with Ukraine

Reputation: 210882

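As @JonClements has already said in the comments, this is the result of duplicated entries in the columns used for merging/joining. A left join pairs every row of the left frame with each row of the right frame that has the same key, so a key that occurs m times on the left and n times on the right yields m * n rows. Here is a small demo: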

In [5]: df
Out[5]:
   a   b
0  1  11
1  1  12
2  2  21

In [6]: df2
Out[6]:
   a    c
0  1  111
1  1  112
2  2  221
3  2  222
4  3  311

In [7]: df.merge(df2, on='a', how='left')
Out[7]:
   a   b    c
0  1  11  111
1  1  11  112
2  1  12  111
3  1  12  112
4  2  21  221
5  2  21  222
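
To check whether this is what happens in your own data, something like the following sketch may help. It rebuilds the demo frames above, predicts the merged row count from the key counts in df2, and then uses the `validate` argument of `merge` to fail fast when the right-hand keys are not unique. The variable names (`right_counts`, `expected_rows`, `deduped`, `duplicate_keys_detected`) are made up for illustration, and keeping only the first row per key via `drop_duplicates` is just one possible choice; whether it is appropriate depends on what the duplicates in your df2 mean.

```python
import pandas as pd

# Same frames as in the demo above.
df = pd.DataFrame({"a": [1, 1, 2], "b": [11, 12, 21]})
df2 = pd.DataFrame({"a": [1, 1, 2, 2, 3], "c": [111, 112, 221, 222, 311]})

# How often does each key occur in the right frame?
right_counts = df2["a"].value_counts()

# Each left row produces one output row per matching right row,
# or a single NaN-filled row when there is no match at all.
expected_rows = int(df["a"].map(right_counts).fillna(1).sum())

merged = df.merge(df2, on="a", how="left")
print(expected_rows, len(merged))

# validate="many_to_one" makes pandas raise a MergeError
# if the merge key is not unique in the right frame.
duplicate_keys_detected = False
try:
    df.merge(df2, on="a", how="left", validate="many_to_one")
except pd.errors.MergeError as exc:
    duplicate_keys_detected = True
    print("duplicated merge keys in df2:", exc)

# Assumption: keeping the first right row per key is acceptable.
# With unique right keys the left join keeps the left row count.
deduped = df2.drop_duplicates(subset="a")
merged_unique = df.merge(deduped, on="a", how="left", validate="many_to_one")
print(len(df), len(merged_unique))
```

In your case the keys would be different on each side, so the same check would use `df2['second'].value_counts()` mapped onto `df['first']`, and `drop_duplicates(subset='second')` if dropping duplicates is what you want.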

Upvotes: 6
