Reputation: 421
Dates wind up in the bottom right corner of a cell, apparently ignoring alignment settings from xlsxwriter. A MWE:
import pandas
df = pandas.DataFrame(
{
"ints": [1, 2, 3]
, 'primes': [2, 3, 5]
, 'odds': [1, 3, 5]
, 'fechas': ['2017-04-07', '2017-05-09', '2017-11-30']
}
)
df['fechas'] = pandas.to_datetime(df['fechas']).dt.date
print(df)
xlsx_writer = pandas.ExcelWriter(
'test.xlsx'
, engine='xlsxwriter'
, date_format='mm/dd/yyyy'
)
df.to_excel(xlsx_writer, sheet_name='Sheet1', index=False)
wb = xlsx_writer.book
ws = xlsx_writer.sheets['Sheet1']
dollar_format = '_($* #,##0.00" "_);_($* (#,##0.0);_($* "-"??_);_(@_)'
dollar_format_wb = wb.add_format({'num_format': dollar_format, 'valign': 'vcenter'})
centre_format_wb = wb.add_format({'align': 'center', 'valign': 'vcenter'})
ws.set_column('A:A', 25, centre_format_wb)
ws.set_column('B:B', 20, centre_format_wb)
ws.set_column('C:C', 15, centre_format_wb)
ws.set_column('D:D', 10, dollar_format_wb)
# The code below was included to generate the screenshot, but isn't
# strictly necessary for the MWE
shadedrow_format_wb = wb.add_format(
{
'bg_color': '#EEEEEE'
, 'left': 1
, 'left_color': '#FFFFFF'
}
)
for r in range(0, 2 + df.shape[0]):
ws.set_row(r + 1, 45)
print(r)
if r % 2 == 0:
# a kludge as we can't change cell range formats after the fact without re-entering cell contents
ws.conditional_format('A{:}:D{:}'.format(r, r), {'type': 'no_errors', 'format': shadedrow_format_wb})
print("\t", r)
xlsx_writer.save()
Columns A, B, and C should be centered horizontally, and all rows except the header should be height 45, with all cell contents centered vertically.
Everything works like it should, except for the date column. Is there a good reason that this shouldn't work? Is there a correct way of getting the date centred? Is it a bug? Is there a work-around?
I have also tried formatting the sheet first, and performing the df.to_excel()
last, with no effect.
Many thanks!
Upvotes: 6
Views: 3681
Reputation: 2003
Just store the date as string in the dataframe: do not convert it to date, or if it is already in date format, convert it back, e.g. using strftime('%Y-%m-%d')
.
This way, you can export it to excel as text, and then set its cell_format as date with num_format
:
…
df.to_excel(xlsx_writer, sheet_name='Sheet1', index=False)
wb = xlsx_writer.book
ws = xlsx_writer.sheets['Sheet1']
…
centre_date_format_wb = wb.add_format({
'align': 'center',
'valign': 'vcenter',
'num_format' : 'mm/dd/yyyy'
})
ws.set_column('A:A', width=25, cell_format=centre_date_format_wb)
…
Upvotes: 0
Reputation: 3823
I've provided an example of how you can achieve your desired date formatting in a .xlsx output using pandas. It would also require adding the datetime module.
As @jmcnamara mentioned, I think the best and most flexible solution would be to use xlsxwriter directly.
Here's a link to another SO answer that provides further background on the excel "serial date" format and getting it from a datetime object in python. This is essentially the same as what I did to convert the column to an excel date. I've also added an additional format (called "centre_date_format_wb").
Here's the full code with my additions/changes:
import pandas
import datetime
df = pandas.DataFrame(
{
"ints": [1, 2, 3]
, 'primes': [2, 3, 5]
, 'odds': [1, 3, 5]
, 'fechas': ['2017-04-07', '2017-05-09', '2017-11-30']
}
)
df['fechas'] = pandas.to_datetime(df['fechas']).dt.date
excel_start_date = datetime.date(1899, 12, 30)
df['fechas'] = df['fechas'] - excel_start_date
df.fechas = df.fechas.dt.days
print(df)
xlsx_writer = pandas.ExcelWriter(
'test.xlsx'
, engine='xlsxwriter'
)
df.to_excel(xlsx_writer, sheet_name='Sheet1', index=False)
wb = xlsx_writer.book
ws = xlsx_writer.sheets['Sheet1']
dollar_format = '_($* #,##0.00" "_);_($* (#,##0.0);_($* "-"??_);_(@_)'
dollar_format_wb = wb.add_format({'num_format': dollar_format, 'valign': 'vcenter'})
centre_format_wb = wb.add_format({'align': 'center', 'valign': 'vcenter'})
#additional format added below
centre_date_format_wb = wb.add_format({'align': 'center', 'valign': 'vcenter', 'num_format' : 'mm/dd/yyyy' })
ws.set_column('A:A', 25, centre_date_format_wb)
ws.set_column('B:B', 20, centre_format_wb)
ws.set_column('C:C', 15, centre_format_wb)
ws.set_column('D:D', 10, dollar_format_wb)
# The code below was included to generate the screenshot, but isn't
# strictly necessary for the MWE
shadedrow_format_wb = wb.add_format(
{
'bg_color': '#EEEEEE'
, 'left': 1
, 'left_color': '#FFFFFF'
}
)
for r in range(0, 2 + df.shape[0]):
ws.set_row(r + 1, 45)
print(r)
if r % 2 == 0:
# a kludge as we can't change cell range formats after the fact without re-entering cell contents
ws.conditional_format('A{:}:D{:}'.format(r, r), {'type': 'no_errors', 'format': shadedrow_format_wb})
print("\t", r)
xlsx_writer.save()
And an image of the resulting worksheet:
Upvotes: 5
Reputation: 41554
The issue is that Pandas is applying a cell format for the date and this overrides the column format.
The only way to workaround it would be iterate over the dataframe and write to XlsxWriter directly, applying all the formats that you want.
Upvotes: 5