daiyue

Reputation: 7458

pandas df merge avoid duplicate column names

When I merge two DataFrames that both have a column called A, the result has columns A_x and A_y. How can I keep A from one DataFrame and discard the other, so that I don't have to rename A_x to A after the merge?

Upvotes: 4

Views: 2236

Answers (2)

jezrael

Reputation: 863056

It depends on whether you need the columns with duplicated names from both DataFrames in the final merged DataFrame.

If you do, pass the suffixes parameter to merge. With an empty left suffix, the column from df1 keeps its original name:

print(df1.merge(df2, on='Key', suffixes=('', '_')))

If you don't, use @Scott Boston's solution.
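As a rough sketch of the suffixes option, assuming two small made-up frames (the values here are illustrative, not from the question): the overlapping column from df1 stays as A, and the one from df2 becomes A_.

```python
import pandas as pd

# Illustrative data only; both frames share 'Key' and 'A'.
df1 = pd.DataFrame({'Key': [0, 1, 2], 'A': [10, 20, 30]})
df2 = pd.DataFrame({'Key': [0, 1, 2], 'A': [100, 200, 300]})

# An empty left suffix leaves df1's column named 'A';
# df2's overlapping column is renamed to 'A_'.
merged = df1.merge(df2, on='Key', suffixes=('', '_'))
print(list(merged.columns))
```

The extra A_ column can then be dropped or ignored, so no rename of A_x is needed.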

Upvotes: 2

Scott Boston

Reputation: 153480

Just filter your dataframe columns before merging.

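import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Key': np.arange(12), 'A': np.random.randint(0, 100, 12), 'C': list('ABCD') * 3})

df2 = pd.DataFrame({'Key': np.arange(12), 'A': np.random.randint(100, 1000, 12), 'C': list('ABCD') * 3})

# Select only the needed columns from df2 so that C is not duplicated
df1.merge(df2[['Key', 'A']], on='Key')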

Output: (Note: C is not duplicated)

    A_x  C  Key  A_y
0    60  A    0  440
1    65  B    1  731
2    76  C    2  596
3    67  D    3  580
4    44  A    4  477
5    51  B    5  524
6     7  C    6  572
7    88  D    7  984
8    70  A    8  862
9    13  B    9  158
10   28  C   10  593
11   63  D   11  177
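The same filtering idea can be taken one step further to keep only df1's A: leave A out of the df2 selection as well. A minimal sketch, assuming made-up data and a hypothetical extra column D in df2 (neither is from the question):

```python
import pandas as pd

# Illustrative data only; 'D' is a hypothetical column unique to df2.
df1 = pd.DataFrame({'Key': [0, 1, 2], 'A': [10, 20, 30], 'C': list('ABC')})
df2 = pd.DataFrame({'Key': [0, 1, 2], 'A': [100, 200, 300], 'D': [1.5, 2.5, 3.5]})

# Keep the join key plus only those df2 columns that df1 does not already have.
keep = ['Key'] + [c for c in df2.columns if c not in df1.columns]

merged = df1.merge(df2[keep], on='Key')
print(list(merged.columns))
```

Because no column names overlap apart from the key, merge adds no suffixes and A comes from df1 only.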

Upvotes: 2
