Reputation: 414
I want to write multiple df of varying sizes to Excel as my code runs.
Some tables will contain source data, and other tables will contain Excel formulas that operate on that source data.
Rather than tracking the range of cells that I wrote the source data to, I want the formula df to contain an Excel reference to the source data df.
This can be done with Excel's Names or with Excel's Table features.
For example in my formula df I can have =INDEX(my_Defined_Name_source_data, 4,3) * 2 and the Excel Name my_Defined_Name_source_data is all I need to index my source data.
Openpyxl details writing Tables here https://openpyxl.readthedocs.io/en/stable/worksheet_tables.html?highlight=tables
Tables doesn't support the merged cells which a multiindex df.to_excel will create.
So I'm looking at Defined Names instead. There's almost no documentation for writing Defined Names in openpyxl using wb.defined_names.append() This is what I've found https://openpyxl.readthedocs.io/en/stable/api/openpyxl.workbook.defined_name.html?highlight=definednames
What I'm asking for help with: How to write a DataFrame to Excel and also give it an Excel Defined Name. Documentation and online examples are almost non existent.
Also gratefully accepting suggestions for alternative ideas since I seem to be accessing something almost nobody else uses.
Upvotes: 1
Views: 3300
Reputation: 66
The "xlsxwriter" library allows you to create an Excel Data Table, so I wrote the following function to take a DataFrame, write it to Excel, and then transform the data to a Data Table.
def dataframe_to_excel_table(df, xl_file, xl_tablename, xl_sheet='Sheet1'):
"""
Pass a dataframe, filename, name of table and Excel sheet name.
Save an excel file of the df, formatted as a named Excel 'Data table'
* Requires "xlsxwriter" library ($ pip install XlsxWriter)
:param df: a Pandas dataframe object
:param xl_file: File name of Excel file to create
:param xl_sheet: String containing sheet/tab name
:param xl_tablename: Data table name in the excel file
:return: Nothing / New Excel file
"""
# Excel doesn't like multi-indexed df's. Convert to 1 value per column/row
# See https://stackoverflow.com/questions/14507794
df.reset_index(inplace=True) # Expand multiindex
# Write dataframe to Excel
writer = pd.ExcelWriter(path=xl_file,
engine='xlsxwriter',
datetime_format='yyyy mm dd hh:mm:ss')
df.to_excel(writer, index=False, sheet_name=xl_sheet)
# Get dimensions of data to size table
num_rows, num_cols = df.shape
# make list of dictionaries of form [{'header' : col_name},...]
# to pass so table doesn't overwrite column header names
# https://xlsxwriter.readthedocs.io/example_tables.html#ex-tables
dataframes_cols = df.columns.tolist()
col_list = [{'header': col} for col in dataframes_cols]
# Convert data in Excel file to an Excel data table
worksheet = writer.sheets[xl_sheet]
worksheet.add_table(0,0, # begin in Cell 'A1'
num_rows, num_cols-1,
{'name': xl_tablename,
'columns': col_list})
writer.save()
Upvotes: 3
Reputation: 414
I fixed this by simply switching from OpenPyXL to XLSXWriter
https://xlsxwriter.readthedocs.io/example_defined_name.html?highlight=names
Upvotes: 1