Reputation: 720
I have the following script that takes an xls file and creates an xlsx with a table (formatted as an actual table). The column headers appear as Column1 Column2 etc within the final table, rather than the actual headers found within the dataframe.
Is there a way to fix this? I also found myself creating a temporary xlsx file, copying it to include the table, and removing the temporary version. Is it possible to streamline this more?
Code:
import os
import shutil
import pandas as pd
import xlsxwriter
import xlrd
from datetime import datetime
date = datetime.today().strftime('%Y-%m-%d')
demand = r"C:\Users\xxxx\Desktop\source.xls"
dfd = pd.read_excel(demand, sheet_name = 'sheet').fillna(0)
dfd = dfd.iloc[6:]
dfd.columns = dfd.iloc[0] #replace headers
dfd = dfd[1:] #remove headers from first row
destd = r"C:\Users\xxxx\Desktop\temporary.xlsx"
destd2 = r"C:\Users\xxxx\Desktop\File (" + str(date) + ").xlsx"
dfd.to_excel(destd)
workbook = xlsxwriter.Workbook(destd)
worksheet = workbook.add_worksheet("Demand")
worksheet.add_table('A1:DL10000', {'data': dfd.values.tolist()})
workbook.close()
destination = shutil.copy2(destd, destd2)
os.remove(destd)
Import Column Headers:
dictonary = r"C:\Users\xxxx\Dictionary - Column Headers.xlsx"
dfd = pd.read_excel(dictonary, sheet_name = 'Demand')
dict1 = dfd.to_dict()
Upvotes: 1
Views: 639
Reputation: 1393
From xlsxwriter's documentation here:
The columns parameter can be used to set properties for columns within the table.
The sub-properties that can be set are header, header_format etc.
The column data must be specified as a list of dicts.
That means that you can add another key 'columnms' in the dictionary of the add_table method to rename the headers.
Here is an example of a table with the headers renamed:
import pandas as pd
df = pd.DataFrame({'Name': ['John','George','Paul'],
'Age': [23,34,42]})
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1', index=False)
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# This will create a basic table with the headers named as 'Column 1' etc
#worksheet.add_table('A1:B4', {'data': df.values.tolist()})
# Add the 'columns' argument to name the headers
worksheet.add_table('A1:B4', {'data': df.values.tolist(),
'columns': [{'header': 'Name'},
{'header': 'Age'}]})
writer.save()
Output:
EDIT
If your dataframe is imported from a file and you do not want to manually type all the header names you could do 'columns': [{'header': df.columns[0]}, {'header': df.columns[1]}]
etc or even better declare a variable and assign it a list comprehension:
header_names = [{'header': df.columns[x]} for x in range(df.shape[1])]
Then you just need to do:
worksheet.add_table('A1:B4', {'data': df.values.tolist(),
'columns': header_names})
Upvotes: 1