Reputation: 31
I want to merge two DataFrames on their index and keep only the distinct columns after merging.
Currently, I merge with pd.merge(X_train, all_data, left_index=True, right_index=True). But every column is returned, and the columns that exist in both DataFrames appear twice, with _x and _y appended to their names to tell them apart.
I just need each distinct column once.
Thanks!
Upvotes: 0
Views: 730
Reputation: 1352
You could work out which columns of all_data are not already in X_train before the merge, and then pass only those columns to the merge command:
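X_train_cols = X_train.columns
all_data_cols = all_data.columns
# Keep only the columns of all_data that X_train lacks, preserving their order
# (a set difference would return them in arbitrary order)
all_data_cols_new = [col for col in all_data_cols if col not in X_train_cols]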
Then:
pd.merge(X_train, all_data[all_data_cols_new], left_index=True, right_index=True)
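Here is a minimal, self-contained sketch of this approach. The two toy DataFrames and their column names (a, b, c) are made up for illustration; only the column a is shared. It picks the extra columns with a list comprehension rather than a set so that they keep their original order:

```python
import pandas as pd

# Hypothetical toy data: both frames contain the column "a".
X_train = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=[10, 11])
all_data = pd.DataFrame({"a": [1, 2, 9], "c": [5, 6, 7]}, index=[10, 11, 12])

# Columns that exist only in all_data, in their original order.
new_cols = [col for col in all_data.columns if col not in X_train.columns]

# Merge on the index; no column overlaps, so no _x/_y suffixes are added.
merged = pd.merge(X_train, all_data[new_cols], left_index=True, right_index=True)

print(merged.columns.tolist())  # ['a', 'b', 'c']
```

Note that pd.merge defaults to an inner join, so only index labels present in both frames (10 and 11 here) survive. If you want every row of X_train kept, X_train.join(all_data[new_cols]) should behave as a left join on the index instead.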
Upvotes: 3