Reputation: 23
Data in Excel:
a b a d
1 2 3 4
2 3 4 5
3 4 5 6
4 5 6 7
Code:
import pandas as pd
df = pd.read_excel(r"sample.xlsx", sheet_name="Sheet1")  # older pandas spelled this sheetname
df
a b a.1 d
0 1 2 3 4
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
How do I delete the column a.1?
When pandas reads the data from Excel, it automatically renames the second a column to a.1.
I tried df.drop("a.1", index=1), but this does not work.
I have a huge Excel file with duplicate column names, and I am interested in only a few of the columns.
Upvotes: 2
Views: 1532
Reputation: 169
More generally, to drop every column that pandas renamed with a numeric suffix (a.1, b.2, and so on):
df = df.drop(df.filter(regex=r'\.\d+$').columns, axis=1)
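As a rough, self-contained illustration of the regex approach, the sketch below rebuilds the question's table from an in-memory CSV instead of sample.xlsx (an assumption made so it runs without a file; read_csv applies the same a.1 renaming to duplicate headers). The names csv_text, mangled and deduped are made up for the example. Be aware that the pattern also matches any genuine column whose name happens to end in a dot followed by digits, such as v1.0.

```python
import io

import pandas as pd

# Stand-in for the Excel sheet: the header row repeats "a".
csv_text = "a,b,a,d\n1,2,3,4\n2,3,4,5\n3,4,5,6\n4,5,6,7\n"
df = pd.read_csv(io.StringIO(csv_text))
print(list(df.columns))  # the second "a" comes back as "a.1"

# Pick out every column whose name ends in ".<digits>" and drop those.
mangled = df.filter(regex=r"\.\d+$").columns
deduped = df.drop(mangled, axis=1)
print(list(deduped.columns))
```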
Upvotes: 0
Reputation: 394159
You need to pass axis=1 for drop to work:
In [100]:
df.drop('a.1', axis=1)
Out[100]:
a b d
0 1 2 4
1 2 3 5
2 3 4 6
3 4 5 7
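For reference, here is a small runnable sketch of the same call, assuming a pandas version recent enough to accept the columns= keyword, which some find easier to read than axis=1. The frame is built by hand to mirror the question's data, and by_axis and by_keyword are illustrative names. It also shows that drop returns a new frame rather than modifying df, so the result has to be assigned.

```python
import pandas as pd

# Hand-built copy of the frame from the question.
df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [2, 3, 4, 5], "a.1": [3, 4, 5, 6], "d": [4, 5, 6, 7]}
)

by_axis = df.drop("a.1", axis=1)        # positional label plus axis=1
by_keyword = df.drop(columns=["a.1"])   # same operation via the columns keyword

print(by_axis.equals(by_keyword))
print(list(df.columns))          # df itself is unchanged
print(list(by_keyword.columns))  # the returned frame lacks a.1
```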
Or just pass a list of the columns of interest to select them:
In [102]:
cols = ['a','b','d']
df[cols]
Out[102]:
a b d
0 1 2 4
1 2 3 5
2 3 4 6
3 4 5 7
This also works with label-based indexing through .loc (the .ix indexer used in older pandas has since been removed):
In [103]:
df.loc[:, cols]
Out[103]:
a b d
0 1 2 4
1 2 3 5
2 3 4 6
3 4 5 7
Upvotes: 2
Reputation: 81654
If you know the name of the column you want to drop:
df = df[[col for col in df.columns if col != 'a.1']]
And if you have several columns you want to drop:
columns_to_drop = ['a.1', 'b.1', ... ]
df = df[[col for col in df.columns if col not in columns_to_drop]]
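The list-comprehension version can be wrapped in a small helper, sketched below. The function name keep_all_except is hypothetical, and the frame is a hand-built copy of the question's data. One practical difference from df.drop: names that are not present in the frame are silently skipped here, whereas df.drop raises a KeyError for them unless errors='ignore' is passed.

```python
import pandas as pd


def keep_all_except(df, columns_to_drop):
    """Return df without the named columns, keeping the original column order."""
    unwanted = set(columns_to_drop)  # set membership keeps the check cheap on wide frames
    return df[[col for col in df.columns if col not in unwanted]]


# Hand-built copy of the frame from the question.
df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [2, 3, 4, 5], "a.1": [3, 4, 5, 6], "d": [4, 5, 6, 7]}
)

# "b.1" does not exist in this frame; the helper simply ignores it.
result = keep_all_except(df, ["a.1", "b.1"])
print(list(result.columns))
```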
Upvotes: 1