Reputation: 708
Per https://github.com/pandas-dev/pandas/pull/21251/files/09e5b456e1af5cde55f18f903ab90c761643b05a, we should be able to append DataFrames to new XLSX sheets.
Based on the documentation, I tried the following:
>>> import pandas as pd
>>>
... d1 = pd.DataFrame({"A":['Bob','Joe', 'Mark'],
... "B":['5', '10', '20']})
>>> d2 = pd.DataFrame({"A":['Jeffrey','Ann', 'Sue'],
... "B":['1', '2', '3']})
>>>
>>> # Create XLSX document for ticker
... writer = pd.ExcelWriter('test.xlsx',engine='openpyxl')
>>> d1.to_excel(writer,sheet_name='d1')
>>> writer.save()
>>>
>>> writer = pd.ExcelWriter('test.xlsx',engine='openpyxl', mode='a')
>>> d2.to_excel(writer,sheet_name='d2')
>>> writer.save()
>>>
>>> pd.__version__
'0.23.4' # Just updated this per a comment
>>>
>>>
The result is a single workbook named 'test.xlsx' with a single tab 'd2'.
How can I prevent the workbook/sheet form being overwritten?
Upvotes: 9
Views: 41726
Reputation: 21
This worked for me, it creates a file if the file does not exists, and append to the end of the file if it already exists
you may need to install openpyxl
import pandas as pd
from openpyxl import load_workbook
def append_xlsx(df, file = 'results.xlsx'):
ext = '.xlsx'
if ext not in file:
file+=ext
if os.path.exists(file):
mode="a"
if_sheet_exists="overlay"
header = False
wb = load_workbook(file)
sheet = wb.worksheets[0]
startrow = sheet.max_row
else:
mode='w'
if_sheet_exists = None
header = True
startrow = 0
with pd.ExcelWriter(
file,
mode=mode,
engine="openpyxl",
if_sheet_exists=if_sheet_exists,
) as writer:
df.to_excel(
writer,
sheet_name="Sheet1",
startrow=startrow,
header=header,
index=False,
encoding='utf8'
)
Upvotes: 2
Reputation: 36
import pandas as pd
writer = pd.ExcelWriter(wk_path + save_file)
# ....
# build sc_files DataFrame and save. sc_files includes
# a column called OS.
sc_file.to_excel(writer, sheet_name='test')
# build data frame of OS counts out of sc_file
counts_os = sc_file.OS.value_counts()
# To append to 'test' sheet, use startcol=x1, startrow=y
# To append counts_OS to the end of the current 'test' sheet
y = len(sc_file)
y += 1
counts_os.to_excel(writer, sheet_name='test',
startcol=1, startrow=y)
# write counts_os to sheet test2
counts_os.to_excel(writer, sheet_name='test2')
writer.save()
writer.close()
Upvotes: 1
Reputation: 708
I submitted a post on GitHub and received a response from the contributors (see the highlighted portion below). It turns out that this functionality won't be released until 0.24
so it is not available in 0.23.1
. FYI - I downloaded the RC and successfully tried out the mode='a'
option. However, there may be a bug with workbooks that do not exist; I receive FileNotFoundError: [Errno 2] No such file or directory: 'test.xlsx'
.
"this feature is being released as part of 0.24 which we just issued a release candidate for over the past few days. You can try on the RC or here on master and if neither works open an issue per the contributing guide, but this wouldn't be expected to work on versions older than that"
Upvotes: 0
Reputation: 14113
You can use with
:
with pd.ExcelWriter('test.xlsx', engine='openpyxl', mode='a') as writer:
d1.to_excel(writer,sheet_name='d1')
d2.to_excel(writer,sheet_name='d2')
writer.save()
writer.close()
This should work just note that the a blank file needs to be created before hand. You can just create a blank file using python if you want. I created a simple loop to, in some ways, mimic the essence of what you are trying to accomplish:
import pandas as pd
from openpyxl import load_workbook
d1 = pd.DataFrame({"A":['Bob','Joe', 'Mark'],
"B":['5', '10', '20']})
d2 = pd.DataFrame({"A":['Jeffrey','Ann', 'Sue'],
"B":['1', '2', '3']})
dfs = [d1,d2]
for i in range(len(dfs)):
sheet = 'd'+str(i+1)
data = dfs[i]
writer = pd.ExcelWriter('atest.xlsx',engine='openpyxl', mode='a')
writer.book = load_workbook('atest.xlsx') # here is the difference
data.to_excel(writer,sheet_name=sheet)
writer.save()
writer.close()
or here is the modified first example:
d1 = pd.DataFrame({"A":['Bob','Joe', 'Mark'],
"B":['5', '10', '20']})
d2 = pd.DataFrame({"A":['Jeffrey','Ann', 'Sue'],
"B":['1', '2', '3']})
writer = pd.ExcelWriter('atest.xlsx', engine='openpyxl', mode='w')
d1.to_excel(writer,sheet_name='d1')
writer.save()
writer.close()
writer = pd.ExcelWriter('atest.xlsx', engine='openpyxl', mode='a')
writer.book = load_workbook('atest.xlsx')
d2.to_excel(writer,sheet_name='d2')
writer.save()
writer.close()
Upvotes: 11