Reputation: 23
I have a folder of numerous .xslx files on my desktop and I am trying to iterate through them one by one to collect the respective sheet names automatically from each workbook.
import openpyxl
import glob
# specifying the path to csv files
path = "C:/Users/X/Desktop/Test"
# csv files in the path
files = glob.glob(path + "/*.xlsx")
sheet_names = []
for x in files:
openpyxl.load_workbook(files)
sheet_names.append(files.sheetnames)
I am getting an error code:
TypeError: expected str, bytes or os.PathLike object, not list
Is there any way to do this iteratively versus one by one if I have all of the workbook names in a list?
Thank you.
I am looking for the sheet names in each respective Excel workbook file.
Upvotes: 1
Views: 873
Reputation: 14354
Try:
sheet_names = []
for x in files:
sheet_names.append(openpyxl.load_workbook(x).sheetnames)
Or, using a list comprehension
:
sheet_names = [openpyxl.load_workbook(x).sheetnames for x in files]
Update:
If you encounter an error message like:
BadZipFile: File is not a zip file
This probably means that the file you are trying to access is corrupted. I.e. openpyxl
is simply unable to open the file, and the remedy to this situation, i.e. fixing the file, I mean, lies outside the code; you will have to take a closer look at that particular file.
However, you can use try ... except
to avoid having such an event interrupt the rest of your procedure. E.g.:
sheet_names = []
for x in files:
try:
sheet_names.append(openpyxl.load_workbook(x).sheetnames)
except Exception as e:
print(f'Unable to process {x}. Error raised', end=' ')
print(e)
E.g. whenever the code encounters a BadZipFile
, now you'll see a print: Unable to process {path_to/some_file.xlsx}. Error raised File is not a zip file
.
Explanation error
The error occurs because you are referencing files
(i.e. a list
), rather than the items inside the list (i.e. x
) in this line:
openpyxl.load_workbook(files) # we need `x` here instead of `files`
Next, in order to actually use the result from the function call, you should have assigned it to a variable, e.g. wb = openpyxl.load_workbook(x)
and then pass it to the list.append
function. I.e.:
sheet_names.append(files.sheetnames) # we would need `wb.sheetnames` here
However, as shown above, you can just put all of this is a one-liner (so, skipping the intermediate assignment to wb
).
Upvotes: 2
Reputation: 1949
See: Pandas: Looking up the list of sheets in an excel file:
import pandas as pd
import os
excelfolder = r'/home/bera/Downloads/'
for root, folder, files in os.walk(excelfolder):
for file in files:
if file.endswith('.xlsx'):
print(file)
fullname = os.path.join(root, file)
print(pd.ExcelFile(fullname).sheet_names)
#Excelfile123.xlsx
#['Sheet1', 'Sheet2', 'Sheet3']
#excelfile2.xlsx
#['A', 'B']
Upvotes: 0