Reputation: 105
I've seen other responses but I don't think they work for my specific issue.
I have an excel sheet with 3 sheets.
first sheet has
Name Dept
John Smith candy
Diana Princ candy
Tyler Perry candy
Perry Plat wood
Jerry Springer clothes
Calvin Klein clothes
Mary Poppins clothes
Ivan Evans clothes
Lincoln Tun warehouse
Oliver Twist kitchen
Herman Sherman kitchen
second sheet is:
Depts Responded
candy
wood
clothes
warehouse
kitchen
Third sheet is:
Depts Not Responded
chocolate
vanilla
computers
I want to update my spreadsheet with Python, such as adding chocolate to the Depts Responded sheet. So far my code is like this:
from openpyxl import load_workbook
import pandas as pd
import os, csv, sys
excel_data = 'file.xlsx'
responded = ['candy', 'wood', 'clothes', 'warehouse', 'kitchen','chocolate']
not_responded = ['vanilla', 'computers']
test_nr = pd.DataFrame(not_responded)
test_r = pd.DataFrame(responded)
test_r = notemp_df.rename({0: 'Depts Responded'}, axis=1)
test_nr = empty_df.rename({0: 'Depts Not Responded'}, axis=1)
test_nr = test_nr.reset_index(drop=True)
test_r = test_r.reset_index(drop=True)
# create excel writer object
writer = pd.ExcelWriter(excel_data)
test_nr.to_excel(writer, 'Departments Not Responded')
test_r.to_excel(writer, 'Departments Responded')
# save the excel file
writer.save()
writer.close()
But when I run this, it deletes the first sheet, and only writes the second and third sheet like this:
Depts Responded
0 candy
1 wood
2 clothes
3 warehouse
4 kitchen
5 chocolate
Depts Not Responded
0 vanilla
1 computers
So that part is working, but I'm losing the first sheet. Is there a way to only edit those sheets and not erase the first one? And also removed the index numbers of the new data.
Thanks to one of the responses below by Redox, I dug more into openpyxl and found an answer that worked for me. As the data is just an example based off my real data. Here is my working code:
from openpyxl import load_workbook
import openpyxl
import pandas as pd
import os, csv, sys
def delete(sheet):
while(sheet.max_row > 1):
sheet.delete_rows(2)
return
excel_data = 'file.xlsx'
responded = ['candy', 'wood', 'clothes', 'warehouse', 'kitchen','chocolate']
not_responded = ['vanilla', 'computers']
test_nr = pd.DataFrame(not_responded)
test_r = pd.DataFrame(responded)
test_r = notemp_df.rename({0: 'Depts Responded'}, axis=1)
test_nr = empty_df.rename({0: 'Depts Not Responded'}, axis=1)
test_nr = test_nr.reset_index(drop=True)
test_r = test_r.reset_index(drop=True)
workbook = openpyxl.load_workbook(excel_data)
ws = workbook['Depts Not Responded'] # Go to the required worksheet
delete(ws)
max1 = ws.max_row
for row, entry in enumerate(not_responded, start=1):
ws.cell(row=row+max1, column=1, value=entry)
workbook.save('output.xlsx') #Save it back
wt = workbook['Depts Responded'] # Go to the required worksheet
delete(wt)
max2 = wt.max_row
for row, entry in enumerate(responded, start=1):
wt.cell(row=row+max2, column=1, value=entry)
workbook.save('output.xlsx') #Save it back
Upvotes: 0
Views: 1276
Reputation: 10017
If what you are trying to do is add a new entry(row) to an existing sheet, the simplest and correct way would be to open the existing file, get to the right worksheet/tab, get to the last row with data and add the row to the row below that. With opwnpyxl.load_workbook(), you can do that.
As you might be adding entries on a daily basis, it is not a good idea to have all the data in the python list and rewrite the same each time. Your code is actually creating new tabs with all the data from scratch each time you run it.
In the below example, I have the file you mentioned (filex.xlsx) with 3 worksheets (Sheet1/2/3) and the data as per the question. The below code will open this workbook and append 'chocolate' to Sheet2 at the bottom of the list. It will NOT remove/replace any existing data.
import openpyxl
workbook = openpyxl.load_workbook('file.xlsx') #Open the file
ws = workbook['Sheet2'] # Go to the required worksheet
ws.cell(row=ws.max_row+1, column=1).value = 'chocolate' #Add the value to cell - max_row is the last row with data in it....
workbook.save('output.xlsx') #Save it back
Output after running above code
Upvotes: 1