Reputation: 3818
I am trying to read the attached xlsx file using Python's openpyxl, but the workbook cannot be loaded. Here is my attempt to open the file:
>>> from openpyxl import load_workbook
>>> workbook = load_workbook(filename = "test.xlsx")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\openpyxl\reader\excel.py", line 136, in load_workbook
    _load_workbook(wb, archive, filename, use_iterators, keep_vba)
  File "C:\Python27\lib\site-packages\openpyxl\reader\excel.py", line 198, in _load_workbook
    keep_vba=keep_vba)
  File "C:\Python27\lib\site-packages\openpyxl\reader\worksheet.py", line 332, in read_worksheet
    fast_parse(ws, xml_source, string_table, style_table, color_index)
  File "C:\Python27\lib\site-packages\openpyxl\reader\worksheet.py", line 320, in fast_parse
    parser.parse()
  File "C:\Python27\lib\site-packages\openpyxl\reader\worksheet.py", line 137, in parse
    dispatcher[tag_name](element)
  File "C:\Python27\lib\site-packages\openpyxl\reader\worksheet.py", line 176, in parse_merge
    self.ws.merge_cells(mergeCell.get('ref'))
  File "C:\Python27\lib\site-packages\openpyxl\worksheet.py", line 815, in merge_cells
    raise InsufficientCoordinatesException(msg)
openpyxl.shared.exc.InsufficientCoordinatesException: Range must be a cell range (e.g. A1:E1)
Upvotes: 1
Views: 10748
Reputation: 967
I ran into this issue while trying to open every file in a directory whose name ends in .xlsx. I later found that the file causing the error was named ~$filename.xlsx. I'm guessing that Microsoft Excel marks a file as currently open by creating a second file with the same name, prefixed with ~$. Once I closed the file in Excel, everything worked as expected.
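If you loop over a directory, one way to avoid this is to filter out those temporary files before loading anything. This is only a minimal sketch: `workbook_paths` is a hypothetical helper name, and it assumes every file whose name starts with ~$ is an Excel lock file that should be ignored.

```python
from pathlib import Path


def workbook_paths(directory):
    """Return the .xlsx files in `directory`, skipping '~$' lock files.

    Assumption: any file whose name starts with '~$' is a temporary
    owner/lock file written by Excel, not a real workbook.
    """
    return sorted(
        path
        for path in Path(directory).glob("*.xlsx")
        if not path.name.startswith("~$")
    )
```

Each returned path can then be passed to load_workbook as usual.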
Upvotes: 1
Reputation: 19497
The problem was that some merged cells were, in fact, merged with themselves. openpyxl expected a merged cell reference always to be a range of cells. A fix for the problem which ignores meaningless merges has been added to the 2.0 branch.
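To check whether a given file contains such merges without going through openpyxl, you can read the worksheet XML straight out of the archive. The following is a rough sketch using only the standard library, not the fix that went into openpyxl. `single_cell_merges` is a hypothetical helper name, and it assumes the worksheets are stored under xl/worksheets/ inside the archive, which is the usual layout but is not guaranteed.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace used by worksheet XML parts.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def single_cell_merges(xlsx_path):
    """Return (sheet part name, ref) pairs for merges that cover one cell.

    Assumption: worksheet parts live under 'xl/worksheets/' in the archive.
    """
    found = []
    with zipfile.ZipFile(xlsx_path) as archive:
        for name in archive.namelist():
            if not (name.startswith("xl/worksheets/") and name.endswith(".xml")):
                continue
            root = ET.fromstring(archive.read(name))
            for merge in root.iter(NS + "mergeCell"):
                ref = merge.get("ref", "")
                start, _, end = ref.partition(":")
                # Either a bare reference such as "C3" or a degenerate
                # range such as "D4:D4" is a cell merged with itself.
                if not end or start == end:
                    found.append((name, ref))
    return found
```

An empty result means no single-cell merges were found in the parts that were scanned.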
Upvotes: 0
Reputation: 3818
OK, I have reported this bug to the openpyxl developers, and they have provided a quick fix for it. Here is the complete thread.
Upvotes: 0
Reputation: 10192
It appears that your .xlsx file is damaged or corrupted. There could be many reasons for this. One possibility is that a file in another format was renamed to have the .xlsx extension, which does not make it a valid workbook. To confirm this behaviour, please try to open the file in Microsoft Excel.
I tried reading the file with openpyxl, xlrd and pandas, but none of them worked.
>>> import xlrd
>>> xlrd.open_workbook('test.xlsx')
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found '<html> <'
>>> from openpyxl import load_workbook
>>> workbook = load_workbook(filename = "test.xlsx")
InvalidFileException: File is not a zip file
>>> import pandas
>>> pandas.ExcelFile('test.xlsx')
InvalidFileException: File is not a zip file
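The xlrd message above shows the file starting with '<html>', which suggests the download may actually be a web page rather than a workbook. A quick way to check what a file really is, before handing it to any Excel library, is to look at its first bytes. This is just a sketch: `describe_file` is a hypothetical helper, and it only distinguishes zip archives (which real .xlsx files are) from files that look like HTML.

```python
import zipfile


def describe_file(path):
    """Classify a file as 'zip', 'html' or 'unknown' from its contents."""
    with open(path, "rb") as handle:
        # Ignore leading whitespace and letter case when sniffing for HTML.
        head = handle.read(64).lstrip().lower()
    if zipfile.is_zipfile(path):
        return "zip"
    if head.startswith(b"<html") or head.startswith(b"<!doctype html"):
        return "html"
    return "unknown"
```

A result of "zip" does not prove the file is a valid workbook, only that it is a zip archive; a result of "html" means the file should be downloaded again.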
Upvotes: 4
Reputation: 12178
I like openpyxl and use it for creating xlsx documents. This could be a bug, or your specific document may use an Excel feature that openpyxl does not support. I would report it to the openpyxl community.
Upvotes: 0
Reputation: 10534
I have never tried openpyxl, but I use xlrd for reading Excel files (.xls and .xlsx). It works great.
See the examples and documentation at http://www.python-excel.org/
Upvotes: -1