selereth
selereth

Reputation: 57

Python - XlsxWriter: Referring to columns by column name, not column letter (Hardcode vs. dynamic variable)

I'm wondering if there's a way to refer to columns when using XlsxWriter by their name, and not the column letter or index, in the event that the column index changes but I still want the formatting to apply to the same column, even if its position changes.

I am also exporting the contents of a data frame to an Excel file beforehand so if there's a way to reference the column using Pandas and then assign that reference to a variable that's invoked where the column reference in the XlsxWriter method goes then that could work too.

Instead of this:

df.to_excel(writer, sheet_name='RCM_Output',index = False)

worksheet.data_validation(A1:A500, {'validate': 'list',
                                  'source': ['blocked', 'unblocked']})

It it possible to write some sort of dynamic variable in place of the hardcoded cell references?

df.to_excel(writer, sheet_name='RCM_Output',index = False)

worksheet.data_validation('Block Status':1:500, {'validate': 'list',
                                  'source': ['blocked', 'unblocked']})

Upvotes: 2

Views: 1806

Answers (2)

jmcnamara
jmcnamara

Reputation: 41644

The best way to do this is to use the row/col notation form of data_validation() and use Pandas to get the index of the column and the dimensions of the dataframe. Something like this:

import pandas as pd


# Create a Pandas dataframe from some data.
df = pd.DataFrame(
    {
        "Data 1": [10, 20, 30, 20],
        "Data 2": [10, 20, 30, 20],
        "Block Status": ["unblocked", "blocked", "unblocked", "unblocked"],
        "Data 3": [10, 20, 30, 20],
        "Data 4": [10, 20, 30, 20],
    }
)

# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter("pandas_example.xlsx", engine="xlsxwriter")

# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name="Sheet1")

# Get the xlsxwriter workbook and worksheet objects.
workbook = writer.book
worksheet = writer.sheets["Sheet1"]

# Get the max row from the dimensions of the dataframe.
row_num = df.shape[0]

# Get the column index from the name. We add 1 to account for the index.
# If the index is turned off then this can be omitted.
col_num = 1 + df.columns.get_loc("Block Status")

# Add the data validation based on these dimensions.
worksheet.data_validation(
    1, col_num, row_num, col_num,
    {
        "validate": "list",
        "source": ["blocked", "unblocked"]
    },
)

# Close the Pandas Excel writer and output the Excel file.
writer.close()

Output:

enter image description here

Upvotes: 2

Marcelo Paco
Marcelo Paco

Reputation: 2934

It it NOT possible to write some sort of dynamic variable in place of the hardcoded cell references. This is what the documentation has to say when it comes to Cell Notation:

XlsxWriter supports two forms of notation to designate the position of cells: Row-column notation and A1 notation.

Row-column notation uses a zero based index for both row and column while A1 notation uses the standard Excel alphanumeric sequence of column letter and 1-based row.

However, since the A1 notation is technically a string you can do something like this:

block_status = 'A'
df.to_excel(writer, sheet_name='RCM_Output',index = False)

worksheet.data_validation(f'{block_status}1:{block_status}500', {'validate': 'list',
                                  'source': ['blocked', 'unblocked']})

But this is rather cumbersome, since you have to have a variable for each column in your csv.

Upvotes: 2

Related Questions