Reputation: 51
I have attempted to open an excel file in which I need to insert dataframes to certain sheets, while leaving other sheets alone. The script works fine when I tested it on other excel files. When I use it on the one I actually need, I get an error message.
Here is the script:
from openpyxl import load_workbook
book = load_workbook(self.directory)
Self.directory refers to my file location. As you can see in the traceback, it fails already at this line when trying to execute load_workbook(), and gives the following error message:
ValueError: Max value is 14
Here is the relevant traceback (I left the directory locations starting with the virtual environment folder 'virtual'):
"""
book = load_workbook(self.directory)
virtual\lib\site-packages\openpyxl\reader\excel.py", line 217, in load_workbook
shared_strings = read_string_table(archive.read(strings_path))
virtual\lib\site-packages\openpyxl\reader\strings.py", line 22, in read_string_table
text = Text.from_tree(node).content
virtual\lib\site-packages\openpyxl\descriptors\serialisable.py", line 84, in from_tree
obj = desc.expected_type.from_tree(el)
virtual\lib\site-packages\openpyxl\descriptors\serialisable.py", line 84, in from_tree
obj = desc.expected_type.from_tree(el)
virtual\lib\site-packages\openpyxl\styles\fonts.py", line 110, in from_tree
return super(Font, cls).from_tree(node)
virtual\lib\site-packages\openpyxl\descriptors\serialisable.py", line 100, in from_tree
return cls(**attrib)
virtual\lib\site-packages\openpyxl\cell\text.py", line 114, in __init__
self.family = family
virtual\lib\site-packages\openpyxl\descriptors\nested.py", line 36, in __set__ 6, in __set__
super(Nested, self).__set__(instance, value)
virtual\lib\site-packages\openpyxl\descriptors\base.py", line 110, in __set__ , in __set__
super(Min, self).__set__(instance, value)
virtual\lib\site-packages\openpyxl\descriptors\base.py", line 89, in __set__ in __set__
raise ValueError('Max value is {0}'.format(self.max))
ValueError: Max value is 14
"""
I realized that the excelfile I was using is over the limit of self.max.
I tried sifting through the the openpyxl scripts myself, but I could not manage to figure out what self.max refers to, or how I can change my Excel File so that I can load the workbook.
Can anyone point me to the right direction?
Thanks in advance!
Upvotes: 5
Views: 9629
Reputation: 6569
For Pandas uses - how to clear XLSX styles completely
which will fix all kind of style errors like 'ValueError: Max value is 14'
1. Add your own XLSX engine for pandas in one of your init.py files which is loaded automatically
from openpyxl.reader.excel import ExcelReader
from openpyxl.xml import constants as openpyxl_xml_constants
from pandas import ExcelFile
from pandas.io.excel._openpyxl import OpenpyxlReader
class OpenpyxlReaderWOFormatting(OpenpyxlReader):
"""OpenpyxlReader without reading formatting
- this will decrease number of errors and speedup process
error example https://stackoverflow.com/q/66499849/1731460 """
def load_workbook(self, filepath_or_buffer, *args, **kwargs):
"""Same as original but with custom archive reader"""
reader = ExcelReader(filepath_or_buffer, read_only=True, data_only=True, keep_links=False)
reader.archive.read = self.read_exclude_styles(reader.archive)
reader.read()
return reader.wb
def read_exclude_styles(self, archive):
"""skips addings styles to xlsx workbook , like they were absent
see logic in openpyxl.styles.stylesheet.apply_stylesheet """
orig_read = archive.read
def new_read(name, pwd=None):
if name == openpyxl_xml_constants.ARC_STYLE:
raise KeyError
else:
return orig_read(name, pwd=pwd)
return new_read
ExcelFile._engines['openpyxl_wo_formatting'] = OpenpyxlReaderWOFormatting
print('Monkey patching pandas XLS engines. See OpenpyxlReaderWOFormatting')
2. Use it, specify engine='openpyxl_wo_formatting' when you read XLSX file
import pandas as pd
df = pd.read_excel(filepath, engine='openpyxl_wo_formatting')
Upvotes: 5
Reputation: 1
If you more than 14 files in a workbook this error will appear. So, try to paste the content of excel into a new file and then try to load it. This resolved my issue. Hope it resolves yours as well.
Upvotes: 0
Reputation: 41159
Instead of patching the __set__
method you can patch the specific descriptor's max value.
This approach is more surgical in that it only patches the font family descriptor that is causing the error. By comparison, if you patch descriptors\base.py
as suggested in other answers, you'll be allowing potentially bad values for all descriptors, not just font family.
# IMPORTANT, you must do this before importing openpyxl
from unittest import mock
# Set max font family value to 100
p = mock.patch('openpyxl.styles.fonts.Font.family.max', new=100)
p.start()
import openpyxl
openpyxl.open('my-bugged-worksheet.xlsx') # this works now!
This approach uses mock
but in principle you can also patch the .py
file where the openpyxl.styles.fonts.Font.family.max
descriptor is contained.
Upvotes: 3
Reputation: 5493
I had the same issue — unsupported font in the file.
I just opened the file in MS Excel, reset the fonts to a basic Arial and saved a new file. And Openpyxl accepted the new file.
Upvotes: 1
Reputation: 1
Just comment out the line of code in openpyxl which raises the error.
Upvotes: -1
Reputation: 11
Issue is resolved if you suppress/commentout the exception like shown below in openpyxl:
def __set__(self, instance, value):
if ((self.allow_none and value is not None)
or not self.allow_none):
value = _convert(self.expected_type, value)
if value > self.max:
self.max=self.max
#raise ValueError('Max value is {0}'.format(self.max))
super(Max, self).__set__(instance, value)
It resolved the issue and now I am able to use
pd.read_excel(io.BytesIO(obj['Body'].read()), engine='openpyxl', sheet_name=[0], header=None)
Upvotes: 1
Reputation: 11
It is an excel file generated WPS but not MS office.
Upvotes: 1
Reputation: 31
Here's what fixed this error for me. I edited lib\site-packages\openpyxl\descriptors\base.py and added a print statement after line 86 in class Max like so:
def __set__(self, instance, value):
if ((self.allow_none and value is not None)
or not self.allow_none):
value = _convert(self.expected_type, value)
if value > self.max:
print(f"value is {value}")
raise ValueError('Max value is {0}'.format(self.max))
super(Max, self).__set__(instance, value)
This printed the value of 34 which is obviously higher than the max value of 14 (it's a font family value).
I then saved a copy of my Excel spreadsheet with a .zip extension, extracted all of the XML files, and then used grep searching for val="34". This led me to 3 cells which somehow had font-family=34 in them. I changed the font to something else in Excel, saved the spreadsheet, then changed it back to the original font (Arial) and saved.
After all of this, the error was gone.
Upvotes: 3
Reputation: 31
I had to remove all formatting in a sheet I was working with.
In Libreoffice; select all, "clear direct formatting"
Upvotes: 3
Reputation: 31
the number of comment boxed didn't solve my problem. I had to remove some worksheets until I got below 14 worksheets in total to be able to open/read the document.
Upvotes: 0
Reputation: 13
I was able to get around the error after removing the 15th+ little "comment boxes" on the sheet.
Upvotes: 1