Reputation:
I have several dataframes that I want to merge into one large dataframe to build a classifier.
This is the base dataframe, user_df_copy
In this dataframe, the id column holds the client id. I have other, smaller dataframes like this one, whose rows are keyed by a user_id column.
So, the goal is to merge these small dataframes into user_df_copy, adding columns such as subject_id that hold a value only where user_id matches the id in the main dataframe, and NaN otherwise. The problem is that the same id appears more than once in these small dataframes.
I also applied get_dummies to the subject_id column like this.
Upvotes: 0
Views: 1731
Reputation: 808
If you just want to drop duplicate rows in the smaller DataFrames, you can use:
df.drop_duplicates(subset="id")
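
To show how this fits the merge described in the question, here is a minimal sketch. The frame subjects_df and all the sample values are made up for illustration, and I'm assuming the key in the smaller frames is user_id (as the question says) rather than id. By default drop_duplicates keeps the first row per key, so any other subject_id values for that user are discarded. If you would rather keep them, the second option one-hot encodes subject_id and collapses to one row per user before merging.

```python
import pandas as pd

# Hypothetical stand-ins for the frames in the question.
user_df_copy = pd.DataFrame({"id": [1, 2, 3]})
subjects_df = pd.DataFrame({
    "user_id": [1, 1, 2],
    "subject_id": [10, 11, 10],
})

# Option 1: keep only the first row per user_id, then left-merge onto the
# base frame. Users with no match (id 3 here) get NaN in subject_id.
deduped = subjects_df.drop_duplicates(subset="user_id")
merged = (
    user_df_copy
    .merge(deduped, how="left", left_on="id", right_on="user_id")
    .drop(columns="user_id")
)

# Option 2: one-hot encode subject_id, then take the max per user so each
# user ends up with a single row that flags every subject they have.
dummies = pd.get_dummies(
    subjects_df, columns=["subject_id"], prefix="subject", dtype=int
)
per_user = dummies.groupby("user_id", as_index=False).max()
merged_all = (
    user_df_copy
    .merge(per_user, how="left", left_on="id", right_on="user_id")
    .drop(columns="user_id")
)

print(merged)
print(merged_all)
```

Either way, how="left" keeps every row of user_df_copy, and because the right-hand frame has one row per user after deduplication or aggregation, the merge does not multiply rows in the base dataframe.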
Upvotes: 1