I am having an issue merging two dataframes with different numbers of rows. The first dataframe has 5K rows, and the second dataframe has 20K rows. There is a column "id" in both frames, and all 5K "id" values occur in the frame with 20K rows.
First frame, "df":
   A  B  id  A_1  B_1
0  1  1   1  0.5  0.5
1  3  2   2  0.2  0.4
2  3  4   3  0.8  0.9
Second frame, "df_2":
   A  B  id
0  1  1   1
1  3  2   2
2  3  4   3
3  1  2   4
4  3  1   5
Desired output frame, "df_out":
   A  B  id  A_1  B_1
0  1  1   1  0.5  0.5
1  3  2   2  0.2  0.4
2  3  4   3  0.8  0.9
3  1  2   4  NaN  NaN
4  3  1   5  NaN  NaN
My attempts to merge on 'id' have left me with only the 5K rows. The operation I am seeking should preserve all the rows of the large dataframe and fill in NaN values for the data that does not exist in the small frame.
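For reference, here is a minimal sketch of the operation described above, rebuilt from the small example frames. It assumes pandas, and it assumes "id" is unique in "df" (duplicate ids there would multiply rows in the result). It uses a left merge with the large frame on the left; pandas' `merge` defaults to `how="inner"`, which may be why only the matching rows survived in my attempts. Selecting only "id", "A_1" and "B_1" from "df" is a choice made here to avoid duplicated "A"/"B" columns, not something required by the merge itself.

```python
import pandas as pd

# Small frame: has the extra columns A_1 and B_1.
df = pd.DataFrame({
    "A": [1, 3, 3],
    "B": [1, 2, 4],
    "id": [1, 2, 3],
    "A_1": [0.5, 0.2, 0.8],
    "B_1": [0.5, 0.4, 0.9],
})

# Large frame: contains every id from df, plus ids that df lacks.
df_2 = pd.DataFrame({
    "A": [1, 3, 3, 1, 3],
    "B": [1, 2, 4, 2, 1],
    "id": [1, 2, 3, 4, 5],
})

# Left merge: keep every row of df_2 and attach A_1/B_1 where the id matches.
# Rows of df_2 whose id is absent from df get NaN in A_1 and B_1.
df_out = df_2.merge(df[["id", "A_1", "B_1"]], on="id", how="left")

print(df_out)
```

With the real data the same call would keep all 20K rows of the large frame, provided the uniqueness assumption on "id" holds.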
Thanks