Reputation: 35
I have abc.xlsb file having column name "Reporting Date". It is having values "16-Aug-21". I am trying to convert abc.xlsb to a CSV file. But I am getting random value for date "44424"
import pandas as pd
xls = pd.ExcelFile('abc.xlsb')
df = pd.read_excel(xls, 'xyz_sheetname')
print(df['Reporting Date'])
Output :
44424
Expected output :
16-Aug-21
Please help. Thanks in advance.
Upvotes: 0
Views: 843
Reputation: 401
44424 represents the number of days passed since 12/31/1899
For example:
1 = 31 Dec 1899
2 = 1 Jan 1900
3 = 2 Jan 1900
So, 44424 would be exactly 16 Aug 2021 as expected.
You can use this logic on pandas to get the actual date before exporting into csv.
You might want to check this: https://www.ablebits.com/office-addins-blog/2019/08/13/google-sheets-change-date-format/
This should work:
from datetime import datetime, timedelta
import pandas as pd
df = pd.DataFrame({"date": [1,2,3,4, 44424]})
df['date'] = df['date'].apply(lambda x: timedelta(x) + datetime(1899,12,30))
print(df)
Upvotes: 2