Reputation: 145
The problem is the following. I have a dataframe:
   a  a  b  a  b
0  0  1  2  1  2
1  3  4  5  4  5
For each column name, I want to remove its duplicate columns, i.e. columns that have the same name and the same values as an earlier column. The resulting dataframe should be:
   a  a  b
0  0  1  2
1  3  4  5
I have achieved this by calling drop_duplicates() on the transpose of df[['column_name']] for each column name, but it's too slow.
I am wondering if there is a faster way to solve it.
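For reference, the per-name approach described above might look roughly like the sketch below. The question does not show the actual code, so this is only a guess at it; the function name `dedupe_per_name` is hypothetical.

```python
import pandas as pd

# Example frame from the question; the column labels repeat on purpose.
df = pd.DataFrame([[0, 1, 2, 1, 2], [3, 4, 5, 4, 5]],
                  columns=["a", "a", "b", "a", "b"])

def dedupe_per_name(frame):
    """Hypothetical helper: transpose + drop_duplicates once per column name."""
    parts = []
    for name in frame.columns.unique():
        # Boolean mask selects every column carrying this label.
        same_name = frame.loc[:, frame.columns == name]
        # Transpose so columns become rows, drop repeated rows, transpose back.
        parts.append(same_name.T.drop_duplicates().T)
    # Stitch the per-name pieces back together side by side.
    return pd.concat(parts, axis=1)

slow_result = dedupe_per_name(df)
print(slow_result)
```

Note that this groups the surviving columns by name, so the column order can differ from the original when same-named columns are not adjacent. It also transposes once per distinct name, which is a plausible reason for it being slow on wide frames.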
Upvotes: 0
Views: 621
Reputation: 323396
IIUC
df = df.loc[:, ~(df.T.duplicated() & df.columns.duplicated())]
Out[184]:
   a  a  b
0  0  1  2
1  3  4  5
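To see why the one-liner works, here is a self-contained sketch that splits it into its two masks. The names `same_values`, `same_label` and `result` are just for illustration, and the `.to_numpy()` call is an addition that strips the duplicated index before combining the masks.

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame([[0, 1, 2, 1, 2], [3, 4, 5, 4, 5]],
                  columns=["a", "a", "b", "a", "b"])

# True where a column's values repeat those of an earlier column.
same_values = df.T.duplicated().to_numpy()
# True where a column's label repeats an earlier label.
same_label = df.columns.duplicated()

# Drop the columns flagged by both masks; keep everything else.
result = df.loc[:, ~(same_values & same_label)]
print(result)
```

The two masks are computed independently, so a column is also dropped if its values match an earlier column with a different name while its label matches an earlier column with different values. If that case can occur in your data, the mask would need to compare values within each name only.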
Upvotes: 2