Reputation: 2141
I have two DataFrames, frame1 and frame2:
In [10]: frame1[:5]
Out[10]:
    cid
0   531
1  1102
2  1103
3  1406
4  1409
In [14]: frame2[:5]
Out[14]:
     cid   media_cost     imps  booked_revenue
0  72692    29.671446    13918       84.961853
1  72704  3121.781201  6992946     9912.982516
2    531     0.001540        2        0.000000
3  39964  2307.119001  3997167     5425.629736
4  72736    45.716847   143574       56.280000
frame1 has 60,888 rows and frame2 has 139,846 rows.
Using these two DataFrames, I want to create a third DataFrame, frame3, that consists of frame2 with every row removed whose cid value also appears in frame1. In this example, frame3 would be frame2 without row 2 (cid 531), because that cid also appears in frame1.
Upvotes: 1
Views: 766
Reputation: 353179
How about using isin to build a boolean mask and negating it with ~:
>>> f1
    cid
0   531
1  1102
2  1103
3  1406
4  1409
>>> f2
     cid   media_cost     imps  booked_revenue
0  72692    29.671446    13918       84.961853
1  72704  3121.781201  6992946     9912.982516
2    531     0.001540        2        0.000000
3  39964  2307.119001  3997167     5425.629736
4  72736    45.716847   143574       56.280000
>>> f2[~f2.cid.isin(f1.cid)]
     cid   media_cost     imps  booked_revenue
0  72692    29.671446    13918       84.961853
1  72704  3121.781201  6992946     9912.982516
3  39964  2307.119001  3997167     5425.629736
4  72736    45.716847   143574       56.280000
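If you want to run this end to end, here is a minimal sketch that rebuilds only the five sample rows shown in the question (the real frames are much larger) and applies the same mask. The names mask, frame3 and frame3_reset are illustrative, and the reset_index step is optional; it assumes you would rather have a fresh 0..n-1 index than keep the original row labels.

```python
import pandas as pd

# Sample data copied from the first five rows shown in the question.
frame1 = pd.DataFrame({"cid": [531, 1102, 1103, 1406, 1409]})
frame2 = pd.DataFrame({
    "cid": [72692, 72704, 531, 39964, 72736],
    "media_cost": [29.671446, 3121.781201, 0.001540, 2307.119001, 45.716847],
    "imps": [13918, 6992946, 2, 3997167, 143574],
    "booked_revenue": [84.961853, 9912.982516, 0.000000, 5425.629736, 56.280000],
})

# True for every row of frame2 whose cid also occurs in frame1.
mask = frame2["cid"].isin(frame1["cid"])

# Negate the mask to keep only the rows whose cid is NOT in frame1.
frame3 = frame2[~mask]

# Optional (an assumption about what you want): renumber the rows from 0.
frame3_reset = frame3.reset_index(drop=True)

print(frame3)
```

Note that frame3 keeps frame2's original index labels (0, 1, 3, 4 here), which is why the reset_index variant is included.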
Upvotes: 3