pandas df merge avoid duplicate column names

Question

When I merge two DataFrames that both have a column called A, the result has columns A_x and A_y. How can I keep A from one DataFrame and discard the other, so that I don't have to rename A_x to A after the merge?

Scott Boston · Accepted Answer

Just filter your dataframe columns before merging.

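import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Key':np.arange(12),'A':np.random.randint(0,100,12),'C':list('ABCD')*3})

df2 = pd.DataFrame({'Key':np.arange(12),'A':np.random.randint(100,1000,12),'C':list('ABCD')*3})

# Select only the join key and the columns you want from df2 before merging
df1.merge(df2[['Key','A']], on='Key')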

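Output: (Note: C is not duplicated, because it was filtered out of df2. A is still present in both frames here, so it gets the _x and _y suffixes.)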

    A_x  C  Key  A_y
0    60  A    0  440
1    65  B    1  731
2    76  C    2  596
3    67  D    3  580
4    44  A    4  477
5    51  B    5  524
6     7  C    6  572
7    88  D    7  984
8    70  A    8  862
9    13  B    9  158
10   28  C   10  593
11   63  D   11  177
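To keep A from df1 only, the same filtering idea can be pushed one step further: select from df2 just the join key plus the columns df1 does not already have. The sketch below is a minimal illustration, not necessarily what the answer intended. It assumes a hypothetical extra column 'B' in df2 (not in the original example) so that the merge has something new to bring in.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Key': np.arange(12),
                    'A': np.random.randint(0, 100, 12),
                    'C': list('ABCD') * 3})

# 'B' is a hypothetical extra column, added only for illustration.
df2 = pd.DataFrame({'Key': np.arange(12),
                    'A': np.random.randint(100, 1000, 12),
                    'C': list('ABCD') * 3,
                    'B': np.arange(12) * 10})

# Columns that exist in df2 but not in df1, plus the join key.
cols_to_use = df2.columns.difference(df1.columns).tolist() + ['Key']

# Only 'B' and 'Key' are taken from df2, so no A_x / A_y suffixes appear.
merged = df1.merge(df2[cols_to_use], on='Key')
print(merged.columns.tolist())
```

If you already know which columns clash, df1.merge(df2.drop(columns=['A', 'C']), on='Key') should give the same result with less code.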
