Reputation: 17694
How is it possible that a pandas left join like
df.merge(df2, left_on='first', right_on='second', how='left')
increases the data frame from 221309 to 1388680 rows?
Shape of df: (221309, 83)
Shape of df2: (7602, 6)
Upvotes: 3
Views: 4566
Reputation: 210882
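As @JonClements has already said in the comments, it's the result of duplicated entries in the columns used for merging/joining. Each row of the left frame is paired with every row of the right frame that has the same key, so a key that occurs m times on the left and n times on the right produces m × n rows in the result. Here is a small demo: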
In [5]: df
Out[5]:
   a   b
0  1  11
1  1  12
2  2  21
In [6]: df2
Out[6]:
   a    c
0  1  111
1  1  112
2  2  221
3  2  222
4  3  311
In [7]: df.merge(df2, on='a', how='left')
Out[7]:
   a   b    c
0  1  11  111
1  1  11  112
2  1  12  111
3  1  12  112
4  2  21  221
5  2  21  222
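To check whether this is what is happening in your own data, something like the sketch below may help. It rebuilds the demo frames, predicts the size of the left join from the number of matches per key, and shows two possible ways to react: letting pandas raise via the `validate` argument of `merge`, or dropping duplicate keys on the right side first. The names `matches`, `expected_rows`, `raised` and `deduped` are only illustrative, and keeping the first duplicate is an arbitrary assumption; which row to keep (or whether to aggregate instead) depends on your data.

```python
import pandas as pd

# Same frames as in the demo above
df = pd.DataFrame({"a": [1, 1, 2], "b": [11, 12, 21]})
df2 = pd.DataFrame({"a": [1, 1, 2, 2, 3], "c": [111, 112, 221, 222, 311]})

# Number of rows in df2 for each key value
matches = df2["a"].value_counts()

# Each left row yields one output row per matching right row,
# or exactly one row (filled with NaN) if there is no match at all
expected_rows = int(df["a"].map(matches).fillna(1).sum())

merged = df.merge(df2, on="a", how="left")
print(expected_rows, len(merged))

# Ask pandas to check that the right-hand keys are unique;
# with duplicated keys in df2 this raises a MergeError
try:
    df.merge(df2, on="a", how="left", validate="many_to_one")
    raised = False
except pd.errors.MergeError:
    raised = True

# One possible fix (an assumption, not the only option):
# keep only the first row per key in df2 before merging
deduped = df.merge(df2.drop_duplicates(subset="a"), on="a", how="left")
print(deduped.shape)
```

With `left_on='first'` and `right_on='second'` as in the question, the same check would use `df2['second'].value_counts()` and `df['first'].map(...)`.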
Upvotes: 6