JoNm
JoNm

Reputation: 21

Pandas read_excel returns columns of type object

Using the following code to read in a table from Excel, discard a blank line under the column titles, and rename the Index:

Energy = pd.read_excel('assets/Energy Indicators.xls', header=16, usecols="C:F", index_col=0, nrows=227)
Energy = Energy.iloc[1:,:]
Energy.index.rename('Country',inplace=True)

For an Excel table that looks like this:

Excel table showing header and col names

This successfully returns the dataframe I need with the correct rows and columns. However, Energy.dtypes shows all columns to be objects, and I need the first column containing country names (which I've read as Index) to be type str. I tried Energy.index.astype(str) but still had "object" as the Index type. I tied adding dtype={0:str} to the read_excel call, but still has "object" as the Index type.

What is the correct method to either a) cause read_excel to return my Index as str; or b) convert index type to str?

Upvotes: 2

Views: 615

Answers (1)

j__carlson
j__carlson

Reputation: 1348

You have to define converters within the pd.read function:

Energy = pd.read_excel('assets/Energy Indicators.xls', header=16, usecols="C:F", index_col=0, nrows=227, converters={'my_column_name':str})

Upvotes: 2

Related Questions