Bappa

Reputation: 81

Drop duplicate columns

I have a dataframe that can contain duplicate columns: the column values are exactly identical. I need to find all instances of those duplicates and keep only one of each.

Here is what it looks like

   0         1      2     3       4     5         6         7
0  DATE      YEARS  DAYS  MONTHS  YEAR  DATE      DATE      YEARS
1  1/1/2010  2010   0     1       2010  1/1/2010  1/1/2010  2010
2  1/2/2010  2010   1     1       2010  1/2/2010  1/2/2010  2010
3  1/3/2010  2010   2     1       2010  1/3/2010  1/3/2010  2010
4  1/4/2010  2010   3     1       2010  1/4/2010  1/4/2010  2010
5  1/5/2010  2010   4     1       2010  1/5/2010  1/5/2010  2010
6  1/6/2010  2010   5     1       2010  1/6/2010  1/6/2010  2010
7  1/7/2010  2010   6     1       2010  1/7/2010  1/7/2010  2010

In the data above, the 'DATE' and 'YEARS' columns repeat. I need to drop the repeats and keep just one 'DATE' and one 'YEARS' column. The final result should have exactly one instance each of DATE, YEARS, DAYS, MONTHS and YEAR:

   0         1      2     3       4
0  DATE      YEARS  DAYS  MONTHS  YEAR
1  1/1/2010  2010   0     1       2010
2  1/2/2010  2010   1     1       2010
3  1/3/2010  2010   2     1       2010
4  1/4/2010  2010   3     1       2010
5  1/5/2010  2010   4     1       2010
6  1/6/2010  2010   5     1       2010
7  1/7/2010  2010   6     1       2010

Upvotes: 0

Views: 54

Answers (1)

BENY

Reputation: 323226

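Let us do drop_duplicates on the transposed frame. After the transpose, column 0 holds the labels (DATE, YEARS, ...), so passing 0 as the subset drops every row whose label repeats and keeps the first one. The second transpose restores the original orientation.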

df = df.T.drop_duplicates(0).T
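
For anyone who wants to try this, here is a minimal sketch. It assumes the labels sit in row 0 as ordinary values, as shown in the question. The sample rows and the names `rows`, `deduped`, `mask` and `alt` are made up for illustration. The last two lines show an alternative that skips the transpose; that variant is not part of the one-liner above.

```python
import pandas as pd

# Hypothetical sample mirroring the question: labels are stored in row 0
rows = [
    ["DATE", "YEARS", "DAYS", "MONTHS", "YEAR", "DATE", "DATE", "YEARS"],
    ["1/1/2010", 2010, 0, 1, 2010, "1/1/2010", "1/1/2010", 2010],
    ["1/2/2010", 2010, 1, 1, 2010, "1/2/2010", "1/2/2010", 2010],
    ["1/3/2010", 2010, 2, 1, 2010, "1/3/2010", "1/3/2010", 2010],
]
df = pd.DataFrame(rows)

# After the transpose each original column is a row and column 0 holds its label,
# so drop_duplicates(0) keeps the first row for every distinct label
deduped = df.T.drop_duplicates(0).T

# Renumber the columns 0..n-1 so the result matches the desired layout
deduped.columns = range(deduped.shape[1])

# Alternative without the transpose: boolean mask over the labels in row 0
mask = ~df.iloc[0].duplicated()
alt = df.loc[:, mask]
```

Note that only the labels in row 0 are compared, not the full column contents. That is why YEARS and YEAR both survive even though their values are the same.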

Upvotes: 2
