user254087
user254087

Reputation: 51

datetime instead of str in read_excell with pandas

I have a dataset saved in an xls file.
In this dataset there are 4 columns that represent dates, in the format dd/mm/yyyy.
My problem is that when I read it in python using pandas and the function read_excel all the columns are read as string, except one, read as datetime64[ns], also if I specify dtypes={column=str}. Why?

Upvotes: 0

Views: 49

Answers (1)

P Maschhoff
P Maschhoff

Reputation: 186

Dates in Excel are frequently stored as numbers, which allows you to do things like subtract them, even though they might be displayed as human-readable dates like dd/mm/yyyy. Pandas is handily taking those numbers and interpreting them as dates, which lets you deal with them more flexibly.

To turn them into strings, you can use the converters argument of pd.read_excel like so:

df = pd.read_excel(filename, converters={'name_of_date_column': lambda dt: dt.strftime('%d/%m/%Y')})

The strftime method lets you format dates however you like. Specifying a converter for your column lets you apply the function to the data as you read it in.

Upvotes: 1

Related Questions