Python - XlsxWriter: Referring to columns by column name, not column letter (Hardcode vs. dynamic variable)

Question

I'm wondering if there's a way to refer to columns when using XlsxWriter by their name, and not the column letter or index, in the event that the column index changes but I still want the formatting to apply to the same column, even if its position changes.

I am also exporting the contents of a data frame to an Excel file beforehand so if there's a way to reference the column using Pandas and then assign that reference to a variable that's invoked where the column reference in the XlsxWriter method goes then that could work too.

Instead of this:

df.to_excel(writer, sheet_name='RCM_Output',index = False)

worksheet.data_validation(A1:A500, {'validate': 'list',
                                  'source': ['blocked', 'unblocked']})

It it possible to write some sort of dynamic variable in place of the hardcoded cell references?

df.to_excel(writer, sheet_name='RCM_Output',index = False)

worksheet.data_validation('Block Status':1:500, {'validate': 'list',
                                  'source': ['blocked', 'unblocked']})

jmcnamara · Accepted Answer

The best way to do this is to use the row/col notation form of data_validation() and use Pandas to get the index of the column and the dimensions of the dataframe. Something like this:

import pandas as pd


# Create a Pandas dataframe from some data.
df = pd.DataFrame(
    {
        "Data 1": [10, 20, 30, 20],
        "Data 2": [10, 20, 30, 20],
        "Block Status": ["unblocked", "blocked", "unblocked", "unblocked"],
        "Data 3": [10, 20, 30, 20],
        "Data 4": [10, 20, 30, 20],
    }
)

# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter("pandas_example.xlsx", engine="xlsxwriter")

# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name="Sheet1")

# Get the xlsxwriter workbook and worksheet objects.
workbook = writer.book
worksheet = writer.sheets["Sheet1"]

# Get the max row from the dimensions of the dataframe.
row_num = df.shape[0]

# Get the column index from the name. We add 1 to account for the index.
# If the index is turned off then this can be omitted.
col_num = 1 + df.columns.get_loc("Block Status")

# Add the data validation based on these dimensions.
worksheet.data_validation(
    1, col_num, row_num, col_num,
    {
        "validate": "list",
        "source": ["blocked", "unblocked"]
    },
)

# Close the Pandas Excel writer and output the Excel file.
writer.close()

Output:

Python - XlsxWriter: Referring to columns by column name, not column letter (Hardcode vs. dynamic variable)

Answers (2)

Related Questions