Reputation: 35
I have a block of code that creates dataframes from an Excel file (File A), applies formatting, and saves the result to another Excel file (File B).
Part of the formatting involves the .conditional_format() method of the XlsxWriter package, like so:
worksheet.conditional_format('A1:F80',
                             {'type': 'cell',
                              'criteria': '==',
                              'value': '"Folder & Contents:"',
                              'format': boldfill})
File A can contain varying amounts of data, so File B varies in size.
Is there a way to have conditional_format() use the size of the resulting dataframe, so that the formatting extends to the end of the table rather than over a fixed range such as 'A1:F80' in the code above?
Thanks!!!
Upvotes: 1
Views: 1742
Reputation: 41644
Conditional formatting, like almost every other API in XlsxWriter, takes either A1 notation or (row, col) notation. So the following are equivalent:
worksheet.conditional_format('A1:F80', {...})
worksheet.conditional_format(0, 0, 79, 5, {...})
In your case you could base the last row on the length of the dataframe (with the header written in row 0, the data ends at zero-based row len(df)):
worksheet.conditional_format(0, 0, len(df), 5, {...})
Or something similar. See the docs for a full explanation of the ranges in conditional formats.
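To make the row/column arithmetic explicit, here is a minimal sketch that derives all four bounds from the dataframe's shape instead of hard-coding the last column. The helper names `table_bounds` and `format_whole_table` are hypothetical, and the sketch assumes the frame was written with `index=False` and, by default, a single header row.

```python
def table_bounds(n_rows, n_cols, header=True, startrow=0, startcol=0):
    """Return zero-based (first_row, first_col, last_row, last_col).

    Assumption: the frame was written with index=False, so the first
    data column lands in startcol, and a header (if any) takes one row.
    """
    header_rows = 1 if header else 0
    last_row = startrow + header_rows + n_rows - 1
    last_col = startcol + n_cols - 1
    return startrow, startcol, last_row, last_col


def format_whole_table(worksheet, df, options, **kwargs):
    """Apply a conditional format over the full extent of df on worksheet."""
    n_rows, n_cols = df.shape
    bounds = table_bounds(n_rows, n_cols, **kwargs)
    # conditional_format accepts (first_row, first_col, last_row, last_col, options).
    worksheet.conditional_format(*bounds, options)
    return bounds
```

For a frame with 79 rows and 6 columns this yields (0, 0, 79, 5), the same cells as 'A1:F80'. If the index is written as well, the column bound would presumably need to shift by one.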
Upvotes: 1
Reputation: 7496
I don't think there is a built-in option, so you can define your own function to convert a dataframe's shape into an Excel range.
Here is what you can do:
Get the row range (easy):
def get_row_range(df, start_row=0):
    # Using the Python convention that the first row is zero
    return (str(start_row + 1), str(start_row + df.shape[0]))
Get the column range (less easy):
from string import ascii_uppercase

def get_col_range(df, start_col=0):
    return (get_col_str(start_col), get_col_str(start_col + df.shape[1] - 1))

def get_col_str(num):
    # Zero-based column index to Excel letters (0 -> 'A', 26 -> 'AA').
    # Uses // so it also works on Python 3; supports columns up to 'ZZ'.
    N = len(ascii_uppercase)
    if num // N == 0:
        return ascii_uppercase[num]
    prefix = ascii_uppercase[num // N - 1]
    suffix = ascii_uppercase[num % N]
    return prefix + suffix
Put everything together:
def get_df_range(df, start_col=0, start_row=0):
    col_range = get_col_range(df, start_col)
    row_range = get_row_range(df, start_row)
    return ":".join([x + y for x, y in zip(col_range, row_range)])
Test:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(100, 3))
get_df_range(df) # 'A1:C100'
get_df_range(df, start_col = 1, start_row = 2) # 'B3:D102'
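The column helper above stops at two-letter columns. As a sketch of one way to lift that limit, the conversion can be written as a loop over base-26 digits. The names `col_to_name` and `cell_range` are hypothetical, and like `get_df_range` the range counts only `df.shape[0]` rows, so a header row would have to be added separately. XlsxWriter's own `xlsxwriter.utility` module also offers `xl_col_to_name` and `xl_range` for the same purpose.

```python
def col_to_name(col):
    """Convert a zero-based column index to Excel letters (0 -> 'A', 26 -> 'AA')."""
    if col < 0:
        raise ValueError("column index must be non-negative")
    name = ""
    col += 1  # work one-based: Excel letters have no zero digit
    while col > 0:
        col, rem = divmod(col - 1, 26)
        name = chr(ord("A") + rem) + name
    return name


def cell_range(n_rows, n_cols, start_row=0, start_col=0):
    """Build an A1-style range for a block of n_rows x n_cols cells."""
    first = f"{col_to_name(start_col)}{start_row + 1}"
    last = f"{col_to_name(start_col + n_cols - 1)}{start_row + n_rows}"
    return f"{first}:{last}"


print(cell_range(100, 3))                            # A1:C100
print(cell_range(100, 3, start_row=2, start_col=1))  # B3:D102
```

Given a dataframe, `cell_range(*df.shape)` produces a string that can be passed directly as the first argument of conditional_format().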
Upvotes: 2