Reputation: 46381
I've already read Can Pandas read and modify a single Excel file worksheet (tab) without modifying the rest of the file? but here my question is specific to the layout mentioned hereafter.
How to open an Excel file with Pandas, do some modifications, and save it back:
(1) without removing that there is a Filter
on the first row
(2) without modifying the "displayed column width" of the columns as displayed in Excel
(3) without removing the formulas which might be present on some cells
?
Here is what I tried, it's a short example (in reality I do more processing with Pandas):
import pandas as pd
df = pd.read_excel('in.xlsx')
df['AB'] = df['A'].astype(str) + ' ' + df['B'].astype(str) # create a new column from 2 others
del df['Date'] # delete columns
del df['Time']
df.to_excel('out.xlsx', index=False)
With this code, the Filter
of the first row is removed and the displayed column width are set to a default, which is not very handy (because we would have to manually set the correct width for all columns).
Upvotes: 5
Views: 7933
Reputation: 23099
If you are using a machine that has Excel installed on it, then I highly recommend using the flexible xlwings API. This answers all your questions.
Let's assume I have an Excel file called demo.xlxs
in the same directory as my program.
app.py
import xlwings as xw # pip install xlwings
import pandas as pd
wb = xw.Book('demo.xlsx')
This will create a initiate an xl workbook instance and open your Excel editor to allow you to invoke Python commands.
Let's assume we have the following dataframe that we want to use to replace the ID and Name column:
new_name
A John_new
B Adams_new
C Mo_new
D Safia_new
wb.sheets['Sheet1']['A1:B1'].value = df
Finally, you can save and close.
wb.save()
wb.close()
Upvotes: 5
Reputation: 46381
This is a solution for (1), (2), but not (3) from my original question. (If you have an idea for (3), a comment and/or another answer is welcome).
In this solution, we open the input Excel file two times:
openpyxl
: this is useful to keep the original layout (which seems totally discarded when reading as a pandas
dataframe!)pandas
dataframe df
to benefit from pandas
' great API to manipulate/modify the data itself. Note: data modification is handier with pandas
than with openpyxl
because we have vectorization, filtering df[df['foo'] == 'bar']
, direct access to the columns by name df['foo']
, etc.The following code modifies the input file and keeps the layout: the first row "Filter" is not removed and the column width of each colum is not modified.
import pandas as pd
from openpyxl.utils.dataframe import dataframe_to_rows
from openpyxl import load_workbook
wb = load_workbook('test.xlsx') # load as openpyxl workbook; useful to keep the original layout
# which is discarded in the following dataframe
df = pd.read_excel('test.xlsx') # load as dataframe (modifications will be easier with pandas API!)
ws = wb.active
df.iloc[1, 1] = 'hello world' # modify a few things
rows = dataframe_to_rows(df, index=False)
for r_idx, row in enumerate(rows, 1):
for c_idx, value in enumerate(row, 1):
ws.cell(row=r_idx, column=c_idx, value=value)
wb.save('test2.xlsx')
Upvotes: 2
Reputation: 2479
I would recommend xlwings
, as it interfaces with excel's COM interfaces (like built-in vba), so it is more powerful. I never tested the "preservation of filtering or formula", official doc may provide ways.
For my own use, I just build everything into python, filtering, formulas, so I don't even touch the excel sheet.
Demo:
# [step 0] boiler plate stuff
df = pd.DataFrame(
index=pd.date_range("2020-01-01 11:11:11", periods=100, freq="min"),
columns=list('abc'))
df['a'] = np.random.randn(100, 1)
df['b'] = df['a'] * 2 + 10
# [step 1] google xlwings, and pip/conda install xlwings
# [step 2] open a new excel sheet, no need to save
# (basically this code will indiscriminally wipe whatever sheet that is active on your desktop)
# [step 3] magic, ...and things you can do
import xlwings as xw
wb = xw.books.active
ws = wb.sheets.active
ws.range('A1').current_region.options(index=1).value = df
# I believe this preserves existing formatting, HOWEVER, it will destory filtering
if 1:
# show casing some formatting you can do
active_window = wb.app.api.ActiveWindow
active_window.FreezePanes = False
active_window.SplitColumn = 2 # const_splitcolumn
active_window.SplitRow = 1
active_window.FreezePanes = True
ws.cells.api.Font.Name = 'consolas'
ws.api.Rows(1).Orientation = 60
ws.api.Columns(1).Font.Bold = True
ws.api.Columns(1).Font.ColorIndex = 26
ws.api.Rows(1).Font.Bold = True
ws.api.Rows(1).Borders.Weight = 4
ws.autofit('c') # 'c' means columns, autofitting columns
ws.range(1,1).api.AutoFilter(1)
Upvotes: 2