Danish
Danish

Reputation: 2871

read xlsb file as pandas dataframe and parse the date column as datetime format

I have a 'some.xlsb' file with some 10 columns, out of which 2 are DateTime column.

When I load using pandas the date-time column is parsed in a different form.

Explanations:

where DateTime value corresponding to 4/10/2021 11:50:24 AM - read as 44296.5

Below is the code I tried.

goods_df = pd.read_excel('some.xlsb',
                   engine='pyxlsb', sheet_name='goods_df')

goods_df_header = goods_df.iloc[1]
goods_df.columns = goods_df_header #set the header row as the df header
goods_df= goods_df[2:]
goods_df.head(2)

Upvotes: 4

Views: 4300

Answers (1)

Aditya Sai
Aditya Sai

Reputation: 21

When you read xlsb file using pandas you will get excel time float value because xlsb convert datetime object into an float value before storing.

According to Microsoft 44296.5 means 44296.5 days passed since jan 1st 1900.

You need convert this into epoch and then date by using below formula( epoch value= number of sec passed since jan 1st 1970 00:00:00 ).

a = datetime.datetime.strftime((int(<datevalue from excel>)*86400)-2207520000, "%m/%d/%Y")

Or you can save this xlsb as xlsx and read it you will get exact datetime object.

Upvotes: 2

Related Questions