Reputation: 13
I have python code that will take data from multiple excel files and display the selected data in a separate file for further analysis. It works:
import os
import openpyxl
folder = 'D:\My Documents\Code\Data'
output_file = 'D:\My Documents\Code\Data\masterfile.xlsx'
output_wb = openpyxl.Workbook()
output_sheet = output_wb.active
output_sheet.title = 'masterfile'
cells = ['A2', 'B2', 'C2']
for filename in os.listdir(folder):
if filename.endswith('.xlsx'):
filename = os.path.join(folder, filename)
workbook = openpyxl.load_workbook(filename)
values = [workbook.active[cell].value for cell in cells]
output_sheet.append(values)
output_wb.save(output_file)
However, it can only take code from the active sheet... Which means if I want to take data from Sheet2 I have to save all the excel files (hundreds of them) with Sheet2 as the active sheet otherwise it wont work. So I thought I would try and set the active sheet in the code itself but it does not work. I have only been learning Python for a month and I am not a coder by trade, but a scientist working with a lot of biological data. This codes does not work and I need help to fix it.
import os
import openpyxl
folder = 'D:\My Documents\Code\Data'
output_file = 'D:\My Documents\Code\Data\masterfile.xlsx'
wb = openpyxl.Workbook()
output_wb = openpyxl.Workbook()
output_sheet = output_wb.active
output_sheet.title = 'masterfile'
cells = ['A2', 'B2', 'C2']
for filename in os.listdir(folder):
if filename.endswith('.xlsx'):
filename = os.path.join(folder, filename)
workbook = openpyxl.load_workbook(filename)
workbook.active = workbook.sheetnames.index('Sheet2')
values = [workbook.active[cell].value for cell in cells]
output_sheet.append(values)
output_wb.save(output_file)
I tried setting the active worksheet as Sheet2 but it only takes data from the active sheet the original excel files are stored under.
I have tried the suggestions in these links but to no avail:
Set the active sheet with openpyxl How to switch between sheets in Excel openpyxl Python to make changes Get sheet by name using openpyxl
Any and all help appreciated.
Upvotes: 1
Views: 92
Reputation: 6630
You are not missing any modules
A quick tutorial on selecting worksheets
The active sheet is the sheet that had focus when the workbook was saved and can be accessed as
sheet = wb.active
where 'wb' is the Openpyxl workbook object. This is also for selecting the default sheet when creating a new workbook.
If you want any other Sheet in the workbook then you can select it by name;
sheet = wb['<sheet name>']
where <sheet name>
is the name (aka 'title' in Openpyxl) of the sheet, noting the names are case sensitive. wb['sheet1']
will return an error if the sheet name is 'Sheet1'.
Further; the names of the sheets are contained in the list wb.sheetnames
i.e. this is a list of the names of the sheets only, as strings.
Therefore you could select every sheet in the workbook by;
for sheetname in wb.sheetnames:
sheet = wb[sheetname]
...
but would probably only use this method if you want to filter sheets selected by the title.
The actual worksheet objects are held in the list wb.worksheets
.
If you want to work on every worksheet in the workbook you can loop through all the Sheets by;
for sheet in wb.worksheets:
print(sheet.title)
# do something with the worksheet object 'sheet' having 'title' ...
therefore with this list a sheet can also be selected using its index as;
sheet = wb.worksheets[X]
where X is the index of the sheet you require.
The Sheets are generally listed in the order they appear in the workbook.
Upvotes: 1