Reputation: 721
Getting trouble with either of errors ; writer.book=book
AttributeError: can't set attribute 'book'
or BadZipFile
for code not giving badzipfile error I put code line which writes excel file first, dataOutput=pd.DataFrame(dictDataOutput,index=[0])
but, even though I cannot get rid of writer.book = book AttributeError: can't set attribute 'book'
As one of SO answer suggests I need to bring back openpyxl to previous versions, or to work with CSV file not excel. I think that's not the solution. There should be solution which I could not get in
dataOutput=pd.DataFrame(dictDataOutput,index=[0])
dataOutput.to_excel('output.xlsx') 'output.xlsm'
book = load_workbook('output.xlsx') 'output.xlsm'
writer = pd.ExcelWriter('output.xlsx')OR'output.xlsm'#,engine='openpyxl',mode='a',if_sheet_exists='overlay')
writer.book = book
writer.sheets = {ws.title: ws for ws in book.worksheets}
for sheetname in writer.sheets:
dataOutput.to_excel(writer,sheet_name=sheetname, startrow=writer.sheets[sheetname].max_row, index = False,header= False)
writer.save()
I looked for an answer in enter link description here and in detailed solution of attributeError in enter link description here
---I tried another way
with pd.ExcelWriter('output.xlsx', mode='a',if_sheet_exists='overlay') as writer:
dataOutput.to_excel(writer, sheet_name='Sheet1')
writer.save()
But this time gave another error
FutureWarning: save is not part of the public API, usage can give in unexpected results and will be removed in a future version
writer.save()
after @Andrew's comments I chaned my code to this way;
with pd.ExcelWriter('outputData.xlsm', engine='openpyxl', mode='a', if_sheet_exists='overlay') as writer:
book = load_workbook('outputData.xlsm', keep_vba=True)
writer.book = book
writer.sheets = {ws.title: ws for ws in book.worksheets}
current_sheet = book['Sheet1']
Column_A = current_sheet['A']
maxrow = max(c.row for c in Column_A if c.value is not None)
for sheetname in writer.sheets:
AllDataOutput.to_excel(writer, sheet_name=sheetname, startrow=maxrow, index=False, header=False)
Upvotes: 3
Views: 5783
Reputation: 799
There are a few things that are unclear about your question, but I'll do my best to try and answer what I think you're trying to do.
First off, things that won't work:
to_excel
docs, "If you wish to write to more than one sheet in the workbook, it is necessary to specify an ExcelWriter object", so when you run dataOutput.to_excel('output.xslx')
, you're creating an excel workbook with only a single sheet. Later, it seems like you're trying to write multiple sheets, but you'll only be able to write a single sheet since that's all that's in the workbook to begin with.ExcelWriter
docs, .save()
is deprecated and .close()
should be used instead. But it's recommended to use a context manager (with ... as ... :
) so you can entirely avoid having to do that.ExcelWriter
uses w
(write) mode, so the BadZipFile
error that you were running into earlier was likely related to the order that things were run. If you had written a book with to_excel
, then loaded it with openpyxl
, then used ExcelWriter
, you'd write, load, then immediately overwrite the file with a blank one. If you then tried loading the book again without recreating it with to_excel
, you'd be trying to load an empty file, which is why that error was thrown.The answers that you linked to in your question seem to use a much older version of pandas
with much different behavior surrounding interaction with Excel files. The latest versions seem to simplify things quite a bit!
As an example solution, I'm going to assume you have two DataFrames
called df1
and df2
that you'd like to write to Sheet_1
and Sheet_2
in output.xlsx
, and you'd like to update the data in those sheets (using the overlay
option) with additional DataFrames
called df3
and df4
. To confirm, I have openpyxl = 3.0.10
and pandas = 1.5.0
installed.
As you have things written currently, there aren't any updates in dataOutput
that would be reflected in output.xslx
, but I'm going to create a dictionary called df_dict
that can hold the DataFrames
and be updated based on the sheets they should be written to. I'm sure you can adjust to suit your needs using whatever data structure you prefer!
df1 = pd.DataFrame(data = {'A': ['a', 'b', 'c'], 'B': ['g','h','i']})
df2 = pd.DataFrame(data = {'C': ['o', 'p', 'q'], 'D': ['u','v','w']})
df_dict = {'Sheet_1': df1, 'Sheet_2': df2}
with pd.ExcelWriter('output.xlsx') as writer:
for sheetname, dfname in df_dict.items():
dfname.to_excel(writer, sheet_name = sheetname)
# Updated data is put in DataFrames
df3 = pd.DataFrame(data = {'A': ['a', 'b', 'c', 'd', 'e', 'f'], 'B': ['g', 'h', 'i', 'j', 'k', 'l']})
df4 = pd.DataFrame(data = {'C': ['o', 'p', 'q', 'r', 's', 't'], 'D': ['u', 'v', 'w', 'x', 'y', 'z']})
# df_dict is updated with the new DataFrames
df_dict = {`Sheet_1`: df3, 'Sheet_2': df4}
# This block now uses the name of the sheet from the workbook itself to get the updated data.
# You could also just run the same block as above and it would work.
with pd.ExcelWriter('output.xlsx', mode = 'a', if_sheet_exists = 'overlay') as writer:
for sheet in writer.sheets:
df_dict[sheet].to_excel(writer, sheet_name = sheet)
Upvotes: 7