Reputation: 1277
I'm using openpyxl to deal with Excel sheets. It works fine, but then I encountered a file that gives me the following error:
Traceback (most recent call last):
  File "/home/ute/OM/Python_Scripts/preparePlanFileFromExcelReport.py", line 13, in <module>
    wb = load_workbook(differenceReportFile)
  File "/usr/local/lib/python2.7/dist-packages/openpyxl/reader/excel.py", line 151, in load_workbook
    archive = _validate_archive(filename)
  File "/usr/local/lib/python2.7/dist-packages/openpyxl/reader/excel.py", line 118, in _validate_archive
    archive = ZipFile(f, 'r', ZIP_DEFLATED)
  File "/usr/lib/python2.7/zipfile.py", line 714, in __init__
    self._GetContents()
  File "/usr/lib/python2.7/zipfile.py", line 748, in _GetContents
    self._RealGetContents()
  File "/usr/lib/python2.7/zipfile.py", line 763, in _RealGetContents
    raise BadZipfile, "File is not a zip file"
zipfile.BadZipfile: File is not a zip file
After some searching, I found that this error appears when the file is not a valid xlsx file.
I can open the file normally with MS Excel 2013, but how can I tell if this file is a valid xlsx file?
Upvotes: 1
Views: 5374
Reputation: 14529
Your question is kind of self-answering: Your error message already tells you that (1) OpenPyXL cannot open the file, and (2) the reason is that the file isn't a valid zip file (and thus not a valid .xlsx file).
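To check this yourself before calling openpyxl, here is a minimal sketch using only the standard library. It assumes that a real .xlsx package is a zip archive containing a `[Content_Types].xml` member; the helper name `looks_like_xlsx` is made up for illustration. Passing this check is necessary but does not guarantee that openpyxl can load the file.

```python
import zipfile


def looks_like_xlsx(path):
    """Return True if `path` is a zip archive that contains the
    '[Content_Types].xml' part found in OOXML packages.

    Hypothetical helper: a necessary check, not a sufficient one.
    """
    # is_zipfile returns False (instead of raising) for non-zip content
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        return "[Content_Types].xml" in archive.namelist()
```

If this returns False for a file that Excel opens without complaint, the file is most likely in some other format that merely carries the .xlsx extension.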
If for some reason you need the program to continue even though the file is invalid, you can use the usual try..except mechanism:
from openpyxl import load_workbook
from zipfile import BadZipfile

try:
    wb = load_workbook(differenceReportFile)
except BadZipfile:
    # openpyxl could not open the file as a zip archive
    print('Invalid zip file.')
# continue processing here
If you want to handle the possibility that the .xlsx file is really a misnamed .xls file, you can use xlrd to read the file instead (versions before 2.0 handle both .xls and .xlsx; from 2.0 onward xlrd reads only .xls).
If you want to be able to read ANY file that Excel can read (regardless of the file extension), your only realistic choice is to have Excel itself open the file, which you can do using the COM interface (PyWin32, pywinauto, xlwings, etc.).
Upvotes: 0
Reputation: 19507
If it really isn't a zip file, then it really isn't an .xlsx file, because the zip container is part of the specification. However, Excel will open some files that are not actually Excel files as if they were. Some libraries rely on this, for example, to export a special kind of HTML that Excel can read.
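One way to find out what such a file actually is: look at its first bytes. The sketch below is only a rough heuristic, and the function name and return labels are invented for illustration. It assumes the well-known signatures `PK\x03\x04` for zip archives and `D0 CF 11 E0 A1 B1 1A E1` for the OLE2 compound files used by legacy .xls, and treats anything starting with `<` as HTML or XML.

```python
def sniff_spreadsheet_format(path):
    """Guess the container format from the first bytes of the file.

    Hypothetical helper; the labels it returns are illustrative only.
    """
    with open(path, "rb") as handle:
        head = handle.read(512)
    if head.startswith(b"PK\x03\x04"):
        return "zip"      # .xlsx and other zip-based packages
    if head.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        # legacy .xls; password-protected workbooks also use this container
        return "ole2"
    # skip an optional UTF-8 byte order mark and leading whitespace
    if head.lstrip(b"\xef\xbb\xbf \t\r\n").startswith(b"<"):
        return "markup"   # HTML or XML saved with a spreadsheet extension
    return "unknown"
```

A result of "markup" or "ole2" for a file named .xlsx would explain why Excel opens it while openpyxl raises BadZipfile.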
If you think that the file is correct and that the problem is with openpyxl then please submit a bug report together with a sample file.
Upvotes: 1