Reputation: 1133
The following changes my ids to float, and I've been trying to cast that column back to int since the try_cast parameter isn't working out.
(df1.merge(df2, on='id', how='left', indicator=True)
.where(lambda x: x._merge=='left_only', try_cast=True)
.get(['id'])
.dropna()
)
In the past I would set it up this way:
merged = df1.merge(df2, on='id', how='left', indicator=True)
merged['id'] = merged['id'].astype(int)
merged[merged['_merge']=='left_only']
I'm new to Python, and I've been exploring method chaining to speed up my exploration in Jupyter notebooks.
Is the where method the best method to use? It feels wrong to dropna in order to filter the results I want.
Upvotes: 1
Views: 65
Reputation: 402813
"Is the where method the best method to use?" Actually, we can do better with query.
Also, from your comments and my understanding of df.where, I believe the integers are upcast to floats by this function: rows that fail the condition are replaced with NaN, which an integer column cannot hold. Taking where out of the equation means no further casting is required.
(df1.merge(df2, on='id', how='left', indicator=True)
.query('_merge == "left_only"'))
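To see the difference for yourself, here is a minimal sketch comparing the two chains. The sample frames df1 and df2 (and their columns a and b) are made up for illustration, since the question does not show its data.

```python
import pandas as pd

# Hypothetical sample frames; the original post does not show its data.
df1 = pd.DataFrame({"id": [1, 2, 3, 4], "a": list("wxyz")})
df2 = pd.DataFrame({"id": [2, 4], "b": [20, 40]})

merged = df1.merge(df2, on="id", how="left", indicator=True)

# where() keeps the frame's shape and fills rows that fail the condition
# with NaN, so the integer id column is upcast to a float dtype.
via_where = (merged
             .where(lambda x: x["_merge"] == "left_only")
             .get(["id"])
             .dropna())

# query() drops the non-matching rows instead of masking them,
# so id keeps its integer dtype and no dropna() is needed.
via_query = merged.query('_merge == "left_only"')[["id"]]

print(via_where["id"].dtype)
print(via_query["id"].dtype)
```

Both chains select the same rows; only the where version changes the dtype of id along the way.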
Upvotes: 1