GoneAsync
GoneAsync

Reputation: 421

python xlsxwriter won't align dates

Dates wind up in the bottom right corner of a cell, apparently ignoring alignment settings from xlsxwriter. A MWE:

import pandas

df = pandas.DataFrame(
    {
        "ints": [1, 2, 3]
        , 'primes': [2, 3, 5]
        , 'odds': [1, 3, 5]
        , 'fechas': ['2017-04-07', '2017-05-09', '2017-11-30'] 
    }
)

df['fechas'] = pandas.to_datetime(df['fechas']).dt.date

print(df)

xlsx_writer = pandas.ExcelWriter(
    'test.xlsx'
    , engine='xlsxwriter'
    , date_format='mm/dd/yyyy'
)

df.to_excel(xlsx_writer, sheet_name='Sheet1', index=False)
wb = xlsx_writer.book
ws = xlsx_writer.sheets['Sheet1']


dollar_format = '_($* #,##0.00" "_);_($* (#,##0.0);_($* "-"??_);_(@_)'
dollar_format_wb = wb.add_format({'num_format': dollar_format, 'valign': 'vcenter'})
centre_format_wb = wb.add_format({'align': 'center', 'valign': 'vcenter'})

ws.set_column('A:A', 25, centre_format_wb)
ws.set_column('B:B', 20, centre_format_wb)
ws.set_column('C:C', 15, centre_format_wb)
ws.set_column('D:D', 10, dollar_format_wb)

# The code below was included to generate the screenshot, but isn't
# strictly necessary for the MWE
shadedrow_format_wb = wb.add_format(
    {
        'bg_color': '#EEEEEE'
        , 'left': 1
        , 'left_color': '#FFFFFF'
    }
)

for r in range(0, 2 + df.shape[0]):
    ws.set_row(r + 1, 45)
    print(r)
    if r % 2 == 0:
        # a kludge as we can't change cell range formats after the fact without re-entering cell contents
        ws.conditional_format('A{:}:D{:}'.format(r, r), {'type': 'no_errors', 'format': shadedrow_format_wb})
        print("\t", r)

xlsx_writer.save()

Columns A, B, and C should be centered horizontally, and all rows except the header should be height 45, with all cell contents centered vertically.

excel screen shot

Everything works like it should, except for the date column. Is there a good reason that this shouldn't work? Is there a correct way of getting the date centred? Is it a bug? Is there a work-around?

I have also tried formatting the sheet first, and performing the df.to_excel() last, with no effect.

Many thanks!

Upvotes: 6

Views: 3681

Answers (3)

törzsmókus
törzsmókus

Reputation: 2003

Just store the date as string in the dataframe: do not convert it to date, or if it is already in date format, convert it back, e.g. using strftime('%Y-%m-%d').

This way, you can export it to excel as text, and then set its cell_format as date with num_format:

…
df.to_excel(xlsx_writer, sheet_name='Sheet1', index=False)
wb = xlsx_writer.book
ws = xlsx_writer.sheets['Sheet1']
…
centre_date_format_wb = wb.add_format({
    'align': 'center',
    'valign': 'vcenter',
    'num_format' : 'mm/dd/yyyy'
})
ws.set_column('A:A', width=25, cell_format=centre_date_format_wb)
…

Upvotes: 0

patrickjlong1
patrickjlong1

Reputation: 3823

I've provided an example of how you can achieve your desired date formatting in a .xlsx output using pandas. It would also require adding the datetime module.

As @jmcnamara mentioned, I think the best and most flexible solution would be to use xlsxwriter directly.

Here's a link to another SO answer that provides further background on the excel "serial date" format and getting it from a datetime object in python. This is essentially the same as what I did to convert the column to an excel date. I've also added an additional format (called "centre_date_format_wb").

Here's the full code with my additions/changes:

import pandas
import datetime

df = pandas.DataFrame(
    {
        "ints": [1, 2, 3]
        , 'primes': [2, 3, 5]
        , 'odds': [1, 3, 5]
        , 'fechas': ['2017-04-07', '2017-05-09', '2017-11-30'] 
    }
)

df['fechas'] = pandas.to_datetime(df['fechas']).dt.date


excel_start_date = datetime.date(1899, 12, 30)
df['fechas'] = df['fechas'] - excel_start_date
df.fechas = df.fechas.dt.days

print(df)

xlsx_writer = pandas.ExcelWriter(
    'test.xlsx'
    , engine='xlsxwriter'
)

df.to_excel(xlsx_writer, sheet_name='Sheet1', index=False)
wb = xlsx_writer.book
ws = xlsx_writer.sheets['Sheet1']


dollar_format = '_($* #,##0.00" "_);_($* (#,##0.0);_($* "-"??_);_(@_)'
dollar_format_wb = wb.add_format({'num_format': dollar_format, 'valign': 'vcenter'})
centre_format_wb = wb.add_format({'align': 'center', 'valign': 'vcenter'})
#additional format added below
centre_date_format_wb = wb.add_format({'align': 'center', 'valign': 'vcenter', 'num_format' : 'mm/dd/yyyy' })

ws.set_column('A:A', 25, centre_date_format_wb)
ws.set_column('B:B', 20, centre_format_wb)
ws.set_column('C:C', 15, centre_format_wb)
ws.set_column('D:D', 10, dollar_format_wb)

# The code below was included to generate the screenshot, but isn't
# strictly necessary for the MWE
shadedrow_format_wb = wb.add_format(
    {
        'bg_color': '#EEEEEE'
        , 'left': 1
        , 'left_color': '#FFFFFF'
    }
)

for r in range(0, 2 + df.shape[0]):
    ws.set_row(r + 1, 45)
    print(r)
    if r % 2 == 0:
        # a kludge as we can't change cell range formats after the fact without re-entering cell contents
        ws.conditional_format('A{:}:D{:}'.format(r, r), {'type': 'no_errors', 'format': shadedrow_format_wb})
        print("\t", r)

xlsx_writer.save()

And an image of the resulting worksheet:

Solution image

Upvotes: 5

jmcnamara
jmcnamara

Reputation: 41554

The issue is that Pandas is applying a cell format for the date and this overrides the column format.

The only way to workaround it would be iterate over the dataframe and write to XlsxWriter directly, applying all the formats that you want.

Upvotes: 5

Related Questions