Sinha
Sinha

Reputation: 456

Pandas xlsxwriter to write dataframe to excel and implementing column-width and border related formatting

Background:

I am using Pandas and have a dataframe 'df' which I intend to write into an Excel sheet. I use the code below and get the output Excel sheet as shown in attached snapshot 'Present.JPG':Present.JPG

import pandas as pd
import xlsxwriter

writer = pd.ExcelWriter('output.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
writer.save()

Problem description: I would like to write the dataframe to Excel and incorporate the following changes.

1) Get rid of the first column indicating the index
2) Implement text wrapping on all columns (to auto-size each column width)
3) Sketch thick border A1 to C4, D1 to F4 and column G

Eventually, I would like the Excel sheet to look like as shown in snapshot 'Desired.JPG':Desired.JPG

Tried till now: I tried the following commands but they over-write the border on to the content of the cells. Furthermore, I am not able to figure out how to extend the border (and text wrapping) beyond a single cell.

writer = pd.ExcelWriter("output.xlsx", engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
workbook=writer.book
worksheet= writer.sheets['Sheet1']

full_border = workbook.add_format({"border":1,"border_color": "#000000"})
link_format = workbook.add_format({'text_wrap': True})

worksheet.write("D3", None, full_border)
worksheet.write("E1", None, link_format)

writer.save()

Upvotes: 9

Views: 11015

Answers (1)

Clade
Clade

Reputation: 986

I'm a little late to the party but here is what you were looking for:

import xlsxwriter
import pandas as pd

df = pd.DataFrame({
    'Class': ['A', 'A', 'A'],
    'Type': ['Mary', 'John', 'Michael'],
    'JoinDate YYYY-MM-DD': ['2018-12-12', '2018-12-12', '2018-12-15'],
    'Weight': [150, 139, 162],
    'Height': [166.4, 160, 143],
    'Marks': [96, 89, 71],
    'LastDate YYYY-MM-DD': ['2020-01-17', '2020-01-17', '2020-01-17']
})

with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer:
    # remove the index by setting the kwarg 'index' to False
    df.to_excel(excel_writer=writer, sheet_name='Sheet1', index=False)

    workbook = writer.book
    worksheet = writer.sheets['Sheet1']

    # dynamically set column width
    for i, col in enumerate(df.columns):
        column_len = max(df[col].astype(str).str.len().max(), len(col) + 2)
        worksheet.set_column(i, i, column_len)

    # wrap the text in all cells
    wrap_format = workbook.add_format({'text_wrap': True, 'align': 'center'})
    worksheet.set_column(0, len(df.columns) - 1, cell_format=wrap_format)

    # mimic the default pandas header format for use later
    hdr_fmt = workbook.add_format({
        'bold': True,
        'border': 1,
        'text_wrap': True,
        'align': 'center'
    })

    def update_format(curr_frmt, new_prprty, wrkbk):
        """
        Update a cell's existing format with new properties
        """
        new_frmt = curr_frmt.__dict__.copy()

        for k, v in new_prprty.items():
            new_frmt[k] = v

        new_frmt = {
            k: v
            for k, v in new_frmt.items()
            if (v != 0) and (v is not None) and (v != {}) and (k != 'escapes')
        }

        return wrkbk.add_format(new_frmt)

    # create new border formats
    header_right_thick = update_format(hdr_fmt, {'right': 2}, workbook)
    normal_right_thick = update_format(wrap_format, {'right': 2}, workbook)
    normal_bottom_thick = update_format(wrap_format, {'bottom': 2}, workbook)
    normal_corner_thick = update_format(wrap_format, {
        'right': 2,
        'bottom': 2
    }, workbook)

    # list the 0-based indices where you want bold vertical border lines
    vert_indices = [2, 5, 6]

    # create vertical bold border lines
    for i in vert_indices:
        # header vertical bold line
        worksheet.conditional_format(0, i, 0, i, {
            'type': 'formula',
            'criteria': 'True',
            'format': header_right_thick
        })
        # body vertical bold line
        worksheet.conditional_format(1, i,
                                     len(df.index) - 1, i, {
                                         'type': 'formula',
                                         'criteria': 'True',
                                         'format': normal_right_thick
                                     })
        # bottom corner bold lines
        worksheet.conditional_format(len(df.index), i, len(df.index), i, {
            'type': 'formula',
            'criteria': 'True',
            'format': normal_corner_thick
        })
    # create bottom bold border line
    for i in [i for i in range(len(df.columns) - 1) if i not in vert_indices]:
        worksheet.conditional_format(len(df.index), i, len(df.index), i, {
            'type': 'formula',
            'criteria': 'True',
            'format': normal_bottom_thick
        })

Upvotes: 9

Related Questions