wolfhoundjesse

Reputation: 1133

What to do when .where() can't cast data back to original type

The following chain turns my ids into floats. The try_cast parameter isn't helping, so I've been trying to cast that column back to int.

(df1.merge(df2, on='id', how='left', indicator=True)
    .where(lambda x: x._merge=='left_only', try_cast=True)
    .get(['id'])
    .dropna()
)

In the past I would set it up this way:

merged = df1.merge(df2, on='id', how='left', indicator=True)
merged['id'] = merged['id'].astype(int)
merged[merged['_merge']=='left_only']

I'm new to Python, and I've been exploring method chaining to speed up my exploration in Jupyter notebooks.

  1. How would I perform this operation inline?
  2. Is the where method the best method to use? It feels wrong to dropna in order to filter the results I want.

Upvotes: 1

Views: 65

Answers (1)

cs95

Reputation: 402813

"Is the where method the best method to use?" Actually, we can do better with query.

Also, from your comments and my understanding of df.where, I believe the function itself upcasts the integers to floats: it keeps every row and fills the ones that fail the condition with NaN, which an integer column cannot hold. Taking where out of the equation means no further casting is required.

(df1.merge(df2, on='id', how='left', indicator=True)
    .query('_merge == "left_only"'))
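
To see the difference in dtypes, here is a minimal sketch. The sample frames are made up for illustration (only the 'id' column name comes from your question), and I've left out try_cast because, as far as I know, newer pandas versions no longer accept it. The last step shows one way to do the cast inline if you do want to keep where.

```python
import pandas as pd

# Hypothetical sample data; only the 'id' column name comes from the question.
df1 = pd.DataFrame({'id': [1, 2, 3, 4]})
df2 = pd.DataFrame({'id': [2, 4]})

merged = df1.merge(df2, on='id', how='left', indicator=True)

# where() keeps every row and fills the non-matching ones with NaN,
# which forces the integer 'id' column to float.
via_where = (merged
    .where(lambda x: x['_merge'] == 'left_only')
    .get(['id'])
    .dropna()
)

# query() drops the non-matching rows instead, so no NaN is introduced
# and the dtype of 'id' is left alone.
via_query = (merged
    .query('_merge == "left_only"')
    .get(['id'])
)

# If where() has to stay, the cast can still be chained inline with astype.
via_cast = via_where.astype({'id': int})

print(via_where['id'].dtype)
print(via_query['id'].dtype)
print(via_cast['id'].dtype)
```

Passing a dict to astype casts only the named column, so it slots into a chain without a separate assignment.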

Upvotes: 1
