King
King

Reputation: 962

Xlsxwriter - Trouble formatting pandas dataframe cells using xlsxwriter

I have an excel sheet which came from a pandas dataframe. I then use Xlsxwriter to add formulas, new columns and formatting. The problem is I only seem to be able format what I've written using xlsxwriter and nothing that came from the dataframe. So what I get is something like this half formatted table

As you can see from the image the two columns from the dataframe remain untouched. They must have some kind of default formatting that is overriding mine.

Since I don't know how to convert a worksheet back into to a dataframe the code below is obviously completely wrong but it's just to give an idea of what I'm looking for.

export = "files/sharepointExtract.xlsx"
df = pd.read_excel(export)# df = dataframe

writer = pd.ExcelWriter('files/new_report-%s.xlsx' % (date.today()), engine = 'xlsxwriter')
workbook = writer.book

# Code to make the header red, this works fine because 
# it's written in xlsxwriter using write.row()

colour_format = workbook.add_format()
colour_format.set_bg_color('#640000')
colour_format.set_font_color('white')
worksheet.set_row(0, 15, colour_format)

table_body_format = workbook.add_format()
table_body_format.set_bg_color('blue')

for row in worksheet.rows:
  row.set_row(0,15, table_body_format)

This code gives an Attribute error but even without the for loop we just get what can be seen in the image.

Upvotes: 3

Views: 4597

Answers (1)

Martin Evans
Martin Evans

Reputation: 46779

The following should work:

import pandas as pd
from datetime import date

export = "files/sharepointExtract.xlsx"
df = pd.read_excel(export)

writer = pd.ExcelWriter('files/new_report-{}.xlsx'.format(date.today()), engine ='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1', startrow=1 , startcol=0, header=False, index=False, encoding='utf8')  

workbook = writer.book
worksheet = writer.sheets['Sheet1']

# Code to make the header red background with white text
colour_format = workbook.add_format()
colour_format.set_bg_color('#640000')
colour_format.set_font_color('white')

# Code to make the body blue    
table_body_format = workbook.add_format()
table_body_format.set_bg_color('blue')

# Set the header (row 0) to height 15 using colour_format
worksheet.set_row(0, 15, colour_format)
# Set the default format for other rows
worksheet.set_column('A:Z', 15, table_body_format)

# Write the header manually
for colx, value in enumerate(df.columns.values):
    worksheet.write(0, colx, value)
  
writer.save()

When Pandas is used to write the header, it uses its own format style which overwrites the underlying xlsxwriter version. The simplest approach is to stop it from writing the header and get it to write the rest of the data from row 1 onwards (not 0). This avoids the formatting from being altered. You can then easily write your own header using the column values from the dataframe.

Upvotes: 2

Related Questions