aabujamra

Reputation: 4636

Merge avoiding duplicate columns but keeping only one duplicate

This is a follow-up to this question.

I have two dataframes that I want to merge, but I want to avoid duplicate columns, so I'm doing:

cols_to_use = df2.columns-df1.columns

If I print cols_to_use I get this:

 Index([col1,col2,col3...],dtype=object)

However, there is one column, co_code, that I need to keep in both dataframes, because I'm going to merge on it.

My question is: how do I add one extra column to cols_to_use? I need it to look like this:

Index([co_code,col1,col2,col3...],dtype=object)

I tried different syntaxes, but nothing seemed to work:

cols_to_use = df2.columns-df1.columns+'co_code'
cols_to_use = df2.columns-df1.columns+['co_code']
cols_to_use = df2.columns-df1.columns+df2['co_code'].columns

Upvotes: 2

Views: 132

Answers (2)

MaxU - stand with Ukraine

Reputation: 210832

Similar to @COLDSPEED's solution:

cols_to_use = df2.columns.difference(df1.columns.drop('co_code'))
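As a minimal sketch of what that expression yields, assuming two small frames (the data and every column name other than co_code are made up for illustration): dropping co_code from df1's columns before taking the difference means it is no longer subtracted from df2's columns, so it survives in the result.

```python
import pandas as pd

# Hypothetical frames; only co_code comes from the question.
df1 = pd.DataFrame({"co_code": [1, 2, 3],
                    "name": ["a", "b", "c"],
                    "price": [10.0, 20.0, 30.0]})
df2 = pd.DataFrame({"co_code": [1, 2, 3],
                    "name": ["a", "b", "c"],
                    "sector": ["x", "y", "z"],
                    "volume": [100, 200, 300]})

# Remove co_code from the columns to exclude, then take the set difference.
cols_to_use = df2.columns.difference(df1.columns.drop("co_code"))

print(list(cols_to_use))  # -> ['co_code', 'sector', 'volume']
```

Note that Index.difference returns the labels sorted where possible, so the order may differ from the column order in df2.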

Upvotes: 2

cs95

Reputation: 402263

cols_to_use = df2.columns.difference(df1.columns.difference(['co_code']))

Or,

cols_to_use = df2.columns.difference(df1.columns).tolist() + ['co_code']
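For completeness, a rough sketch of the list-based variant followed by the merge itself. It uses Index.difference for the set difference, since recent pandas versions no longer treat - between two Index objects as a set operation. The frames and the non-key column names are assumptions for illustration, and how="left" is just one possible choice.

```python
import pandas as pd

# Hypothetical frames; only co_code comes from the question.
df1 = pd.DataFrame({"co_code": [1, 2, 3],
                    "name": ["a", "b", "c"],
                    "price": [10.0, 20.0, 30.0]})
df2 = pd.DataFrame({"co_code": [1, 2, 3],
                    "name": ["a", "b", "c"],
                    "sector": ["x", "y", "z"],
                    "volume": [100, 200, 300]})

# Columns that exist only in df2, plus the merge key appended as a plain list item.
cols_to_use = df2.columns.difference(df1.columns).tolist() + ["co_code"]

# Merge on the shared key; the duplicated "name" column is never pulled from df2.
merged = df1.merge(df2[cols_to_use], on="co_code", how="left")

print(list(merged.columns))  # -> ['co_code', 'name', 'price', 'sector', 'volume']
```

Because the overlapping non-key columns are filtered out before merging, the result has no _x / _y suffixed duplicates.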

Upvotes: 2
