Reputation: 443
I have a df with these columns:
Index(['Instrument', 'Date', 'Return on Invst Cap', 'Date',
'Book Value Per Share, Total Equity', 'Date',
'Earnings Per Share Reported - Actual', 'Date',
'Revenue from Business Activities - Total', 'Date',
'Free Cash Flow - Actual', 'Date', 'Total Long Term Debt', 'Date',
'Profit/(Loss) - Starting Line - Cash Flow'],
dtype='object')
There are several columns called 'Date'; some of them hold the same values, some don't.
I would like to keep only the first "Date" column and drop the rest. I think one important step is to rename the first "Date" to something else, for example "1 Date", and then drop the other "Date" columns.
But I failed to rename just this column. For example, I tried df_big5_simplified = df_big5.rename(columns={1: '1 Date'})
to rename the column by its index position,
but the resulting df is exactly the same...
I also tried this approach:
columns=pd.Index(['Date', 'Instrument', 'Return on Invst Cap',
'Book Value Per Share, Total Equity',
'Earnings Per Share Reported - Actual',
'Revenue from Business Activities - Total', 'Free Cash Flow - Actual',
'Total Long Term Debt', 'Profit/(Loss) - Starting Line - Cash Flow'], name='item')
df_big5_simplifed=df_big5.reindex(columns=columns)
Then I got this error:
ValueError: cannot reindex from a duplicate axis
Any ideas? I could have 50 columns with the same name and only want to keep the first one.
Upvotes: 0
Views: 41
Reputation: 6574
You can set all the column names:
df = df.set_axis(['Instrument', 'Date', 'Return on Invst Cap', 'Date2',
'Book Value Per Share, Total Equity', 'Date3',
'Earnings Per Share Reported - Actual', 'Date4',
'Revenue from Business Activities - Total', 'Date5',
'Free Cash Flow - Actual', 'Date6', 'Total Long Term Debt', 'Date7',
'Profit/(Loss) - Starting Line - Cash Flow'], axis=1)
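If typing out every name is impractical (say, 50 repeated columns), an alternative is to drop the repeats with a boolean mask. The following is a minimal sketch, not tested against your data: the small frame, its tickers and its values are made up to stand in for df_big5. It also shows one way to rename a single column by position, since rename(columns={1: ...}) looks for a column labelled 1 rather than the column at position 1.

```python
import pandas as pd

# Hypothetical stand-in for df_big5: two columns share the label "Date".
df_big5 = pd.DataFrame(
    [["AAPL", "2020-12-31", 0.25, "2020-12-30", 4.1],
     ["MSFT", "2021-06-30", 0.30, "2021-06-29", 15.6]],
    columns=["Instrument", "Date", "Return on Invst Cap",
             "Date", "Book Value Per Share, Total Equity"],
)

# Index.duplicated() flags every repeat of a label after its first occurrence,
# so negating it keeps the first "Date" and every uniquely named column.
keep = ~df_big5.columns.duplicated()
df_big5_simplified = df_big5.loc[:, keep]

print(list(df_big5_simplified.columns))

# Renaming by position: edit a copy of the label list and assign it back.
cols = df_big5.columns.tolist()
cols[1] = "1 Date"
renamed = df_big5.copy()
renamed.columns = cols
```

The mask keeps whichever duplicate comes first from the left, so this only does what you want if the first "Date" really is the one to keep.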
Upvotes: 1