Reputation: 17097
I am reading an Excel file into pandas using pd.ExcelFile.
It reads correctly and I can print the DataFrame. But when I try to select a subset of columns like this:
subdf= origdf[['CUTOMER_ID','ASSET_BAL']]
I get this error:
KeyError: "['CUTOMER_ID' 'ASSET_BAL'] not in index"
Do I need to define some kind of index here? When I printed the df, I verified that the columns are there.
Upvotes: 8
Views: 29427
Reputation: 367
And for when you don't have a typo problem, here is a solution: use loc instead,
subdf= origdf.loc[:, ['CUSTOMER_ID','ASSET_BAL']].values
(I'd be glad to learn why this one works, though.)
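As a partial answer to that question, here is a small sketch comparing the two forms. The DataFrame is a made-up stand-in for the one read from Excel. As far as I know, in recent pandas versions (1.0 and later) selecting a list of labels with loc checks the labels the same way the plain brackets do, so it should not get around a misspelled name. Also note that .values returns a NumPy array rather than a DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame read from the Excel file
origdf = pd.DataFrame({"CUSTOMER_ID": [1, 2], "ASSET_BAL": [100.0, 250.5]})

# With correctly spelled names, both forms select the same columns
by_brackets = origdf[["CUSTOMER_ID", "ASSET_BAL"]]
by_loc = origdf.loc[:, ["CUSTOMER_ID", "ASSET_BAL"]]
same = by_brackets.equals(by_loc)

# .values drops the column labels and index and returns a NumPy array
as_array = by_loc.values

# With the misspelled name from the question, loc raises KeyError too
# (assumption: pandas 1.0 or later)
try:
    origdf.loc[:, ["CUTOMER_ID", "ASSET_BAL"]]
    loc_raised = False
except KeyError:
    loc_raised = True
```

So if loc appeared to fix the error, the column names were most likely corrected at the same time.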
Upvotes: 0
Reputation: 248
Ensure that the columns actually exist in the dataframe. For example, you have written CUTOMER and not CUSTOMER, which I assume is the correct name.
You can verify the column names by using list(origdf.columns.values).
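A minimal sketch of that check, using a made-up DataFrame in place of the one read from Excel. The wanted list and the whitespace-stripping step are my own additions: headers coming from spreadsheets sometimes carry leading or trailing spaces, which can produce the same KeyError even when the spelling looks right.

```python
import pandas as pd

# Hypothetical stand-in for the frame read from the Excel file
origdf = pd.DataFrame(
    {"CUSTOMER_ID": [1, 2], "ASSET_BAL": [100.0, 250.5], "REGION": ["N", "S"]}
)

# The names requested in the question, including the typo
wanted = ["CUTOMER_ID", "ASSET_BAL"]

# List the requested names that are not actual column labels
missing = [name for name in wanted if name not in origdf.columns]
print("Missing columns:", missing)

# Optional: remove stray whitespace from the headers before selecting
origdf.columns = origdf.columns.str.strip()

# Select with the corrected names
subdf = origdf[["CUSTOMER_ID", "ASSET_BAL"]]
```

Printing the missing names before selecting points straight at the misspelled label instead of leaving you to compare the lists by eye.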
Upvotes: 16