Ger Cas
Ger Cas

Reputation: 2298

How to highlight rows based on content in Excel Dataframe?

I have an excel file containing data like the one to left, and I'm trying to format it to get the data formatting as table to the right.

enter image description here

With my current code I'm able to format all rows containig headers (H1, H2,...)

This is the content of file.xlsx:

This is my current code:

import pandas as pd
import numpy as np

from xlsxwriter.utility import xl_rowcol_to_cell

data = {'H1': {0: 'A', 1: '', 2: 'H1', 3: 'A', 4: '', 5: 'H1', 6: 'A', 7: 'A', 8: 'B', 9: 'B', 10: 'B', 11: '', 12: 'H1', 13: 'B', 14: 'B', 15: '', 16: 'H1', 17: 'C', 18: 'C', 19: 'C', 20: 'D', 21: 'D', 22: ''}, 'H2': {0: 'Rty', 1: '', 2: 'H2', 3: 'Rty', 4: '', 5: 'H2', 6: 'Rty', 7: 'Rty', 8: 'Rty', 9: 'Rty', 10: 'Rty', 11: '', 12: 'H2', 13: 'Rty', 14: 'Rty', 15: '', 16: 'H2', 17: 'Rty', 18: 'Rty', 19: 'Rty', 20: 'Rty', 21: 'Rty', 22: ''}, 'H3': {0: '1195', 1: '', 2: 'H3', 3: '1195', 4: '', 5: 'H3', 6: '1195', 7: '1195', 8: '1195', 9: '1195', 10: '1195', 11: '', 12: 'H3', 13: '1195', 14: '1195', 15: '', 16: 'H3', 17: '1195', 18: '1195', 19: '1195', 20: '1195', 21: '1195', 22: ''}, 'H4': {0: '9038', 1: 'H3=9038, 000', 2: 'H4', 3: '1355', 4: 'H3=1355, 363', 5: 'H4', 6: '2022', 7: '2022', 8: '2022', 9: '2022', 10: '2022', 11: 'H3=2022, 234', 12: 'H4', 13: '2564', 14: '2564', 15: 'H3=2564, 726', 16: 'H4', 17: '1501', 18: '1501', 19: '1501', 20: '1501', 21: '1501', 22: 'H3=1501, 143'}, 'H5': {0: '1537', 1: '', 2: 'H5', 3: '8', 4: '', 5: 'H5', 6: '59', 7: '78', 8: '76', 9: '6', 10: '31', 11: '', 12: 'H5', 13: '71', 14: '17', 15: '', 16: 'H5', 17: '72', 18: '89', 19: '47', 20: '32', 21: '233', 22: ''}}
df = pd.DataFrame.from_dict(data)


writer = pd.ExcelWriter('Output.xlsx', engine='xlsxwriter')
df.to_excel(writer, index=False, sheet_name='Output')

workbook = writer.book
worksheet = writer.sheets['Output']
number_rows = len(df.index)

format1 = workbook.add_format({'bg_color': 'black', 'font_color': 'yellow'})

for r in range(0,number_rows):
    if df.iat[r,0] == "H1":
        worksheet.set_row(r+1, None, format1) 

writer.save()

This is my current output:

enter image description here

I'm stuck in how to limit the format from column A to E and in how to intarcalate color green, yellow, green yello depending when values in column A changes. I mean, for all consecutive values in column A = "A" highlight in green, when changes highlight to yellow when changes highlightto green again and so on.

How can I do this? Thanks in advance.

Upvotes: 6

Views: 9101

Answers (2)

JimmyNJ
JimmyNJ

Reputation: 1194

If you want to continue using the xlsxwriter Python module you may use the write(…) method on the worksheet object to set the contents and formatting of a cell in one step.

You will have to break up your to_excel() method and write each DataFrame value individually in a loop.

Example cell creation and formatting call:

cell_format = workbook.add_format({'bold': True, 'italic': True})

# inside a loop iterating over your DataFrame
worksheet.write(row, column, value, cell_format)  # Cell is bold and italic.

Upvotes: 2

Elkana Bardugo
Elkana Bardugo

Reputation: 375

You can do it with different excel library, like openpyxl

You can format each cell separately For example:

from openpyxl import Workbook
from openpyxl.styles import Font, Color, colors, fills
from openpyxl.utils.dataframe import dataframe_to_rows
wb = Workbook()
ws = wb.active

for r in dataframe_to_rows(df, index=False, header=True):
    ws.append(r)

a1 = ws['A1']
a1.font = Font(color="FF0000")
a1.fill = fills.PatternFill(patternType='solid', fgColor=Color(rgb='00FF00'))
wb.save("pandas_openpyxl.xlsx")

They have great documentation here: https://openpyxl.readthedocs.io/en/stable/pandas.html

Upvotes: 10

Related Questions