Emil Zaharia-Kezdi
Emil Zaharia-Kezdi

Reputation: 51

Openpyxl: 'ValueError: Max value is 14' when using load_workbook

I have attempted to open an excel file in which I need to insert dataframes to certain sheets, while leaving other sheets alone. The script works fine when I tested it on other excel files. When I use it on the one I actually need, I get an error message.

Here is the script:

from openpyxl import load_workbook
book = load_workbook(self.directory)

Self.directory refers to my file location. As you can see in the traceback, it fails already at this line when trying to execute load_workbook(), and gives the following error message:

ValueError: Max value is 14

Here is the relevant traceback (I left the directory locations starting with the virtual environment folder 'virtual'):

"""
book = load_workbook(self.directory)
virtual\lib\site-packages\openpyxl\reader\excel.py", line 217, in load_workbook
shared_strings = read_string_table(archive.read(strings_path))
virtual\lib\site-packages\openpyxl\reader\strings.py", line 22, in read_string_table
text = Text.from_tree(node).content
virtual\lib\site-packages\openpyxl\descriptors\serialisable.py", line 84, in from_tree
obj = desc.expected_type.from_tree(el)
virtual\lib\site-packages\openpyxl\descriptors\serialisable.py", line 84, in from_tree
obj = desc.expected_type.from_tree(el)
virtual\lib\site-packages\openpyxl\styles\fonts.py", line 110, in from_tree
return super(Font, cls).from_tree(node)
virtual\lib\site-packages\openpyxl\descriptors\serialisable.py", line 100, in from_tree
return cls(**attrib)
virtual\lib\site-packages\openpyxl\cell\text.py", line 114, in __init__
self.family = family
virtual\lib\site-packages\openpyxl\descriptors\nested.py", line 36, in __set__ 6, in __set__
super(Nested, self).__set__(instance, value)
virtual\lib\site-packages\openpyxl\descriptors\base.py", line 110, in __set__ , in __set__ 
super(Min, self).__set__(instance, value)
virtual\lib\site-packages\openpyxl\descriptors\base.py", line 89, in __set__ in __set__
raise ValueError('Max value is {0}'.format(self.max))
ValueError: Max value is 14
"""

I realized that the excelfile I was using is over the limit of self.max.

I tried sifting through the the openpyxl scripts myself, but I could not manage to figure out what self.max refers to, or how I can change my Excel File so that I can load the workbook.

Can anyone point me to the right direction?

Thanks in advance!

Upvotes: 5

Views: 9629

Answers (11)

pymen
pymen

Reputation: 6569

For Pandas uses - how to clear XLSX styles completely

which will fix all kind of style errors like 'ValueError: Max value is 14'

1. Add your own XLSX engine for pandas in one of your init.py files which is loaded automatically

from openpyxl.reader.excel import ExcelReader
from openpyxl.xml import constants as openpyxl_xml_constants
from pandas import ExcelFile
from pandas.io.excel._openpyxl import OpenpyxlReader

class OpenpyxlReaderWOFormatting(OpenpyxlReader):
    """OpenpyxlReader without reading formatting
    - this will decrease number of errors and speedup process
    error example https://stackoverflow.com/q/66499849/1731460 """

    def load_workbook(self, filepath_or_buffer, *args, **kwargs):
        """Same as original but with custom archive reader"""
        reader = ExcelReader(filepath_or_buffer, read_only=True, data_only=True, keep_links=False)
        reader.archive.read = self.read_exclude_styles(reader.archive)
        reader.read()
        return reader.wb

    def read_exclude_styles(self, archive):
        """skips addings styles to xlsx workbook , like they were absent
        see logic in openpyxl.styles.stylesheet.apply_stylesheet """

        orig_read = archive.read

        def new_read(name, pwd=None):
            if name == openpyxl_xml_constants.ARC_STYLE:
                raise KeyError
            else:
                return orig_read(name, pwd=pwd)

        return new_read

ExcelFile._engines['openpyxl_wo_formatting'] = OpenpyxlReaderWOFormatting
print('Monkey patching pandas XLS engines. See OpenpyxlReaderWOFormatting')

2. Use it, specify engine='openpyxl_wo_formatting' when you read XLSX file

import pandas as pd
df = pd.read_excel(filepath, engine='openpyxl_wo_formatting')

Upvotes: 5

Priyanka Mahawar
Priyanka Mahawar

Reputation: 1

If you more than 14 files in a workbook this error will appear. So, try to paste the content of excel into a new file and then try to load it. This resolved my issue. Hope it resolves yours as well.

Upvotes: 0

sytech
sytech

Reputation: 41159

Instead of patching the __set__ method you can patch the specific descriptor's max value.

This approach is more surgical in that it only patches the font family descriptor that is causing the error. By comparison, if you patch descriptors\base.py as suggested in other answers, you'll be allowing potentially bad values for all descriptors, not just font family.

# IMPORTANT, you must do this before importing openpyxl
from unittest import mock
# Set max font family value to 100
p = mock.patch('openpyxl.styles.fonts.Font.family.max', new=100)
p.start()
import openpyxl
openpyxl.open('my-bugged-worksheet.xlsx') # this works now!

This approach uses mock but in principle you can also patch the .py file where the openpyxl.styles.fonts.Font.family.max descriptor is contained.

Upvotes: 3

Nitin Nain
Nitin Nain

Reputation: 5493

I had the same issue — unsupported font in the file.

I just opened the file in MS Excel, reset the fonts to a basic Arial and saved a new file. And Openpyxl accepted the new file.

Upvotes: 1

Mangosteen
Mangosteen

Reputation: 1

Just comment out the line of code in openpyxl which raises the error.

Upvotes: -1

Asad Sohail
Asad Sohail

Reputation: 11

Issue is resolved if you suppress/commentout the exception like shown below in openpyxl:

def __set__(self, instance, value):
        if ((self.allow_none and value is not None)
            or not self.allow_none):
            value = _convert(self.expected_type, value)
            if value > self.max:
                self.max=self.max
                #raise ValueError('Max value is {0}'.format(self.max))
        super(Max, self).__set__(instance, value)

It resolved the issue and now I am able to use

pd.read_excel(io.BytesIO(obj['Body'].read()), engine='openpyxl', sheet_name=[0], header=None)

Upvotes: 1

Jing Cao
Jing Cao

Reputation: 11

It is an excel file generated WPS but not MS office.

  1. you can use xlwings to open it.
  2. you can save to CSV file manually and read.

Upvotes: 1

Mike Divot
Mike Divot

Reputation: 31

Here's what fixed this error for me. I edited lib\site-packages\openpyxl\descriptors\base.py and added a print statement after line 86 in class Max like so:

def __set__(self, instance, value):
    if ((self.allow_none and value is not None)
        or not self.allow_none):
        value = _convert(self.expected_type, value)
        if value > self.max:
            print(f"value is {value}")
            raise ValueError('Max value is {0}'.format(self.max))
    super(Max, self).__set__(instance, value)

This printed the value of 34 which is obviously higher than the max value of 14 (it's a font family value).

I then saved a copy of my Excel spreadsheet with a .zip extension, extracted all of the XML files, and then used grep searching for val="34". This led me to 3 cells which somehow had font-family=34 in them. I changed the font to something else in Excel, saved the spreadsheet, then changed it back to the original font (Arial) and saved.
After all of this, the error was gone.

Upvotes: 3

CarpetLoup
CarpetLoup

Reputation: 31

I had to remove all formatting in a sheet I was working with.

In Libreoffice; select all, "clear direct formatting"

Upvotes: 3

Ercan Dogan
Ercan Dogan

Reputation: 31

the number of comment boxed didn't solve my problem. I had to remove some worksheets until I got below 14 worksheets in total to be able to open/read the document.

Upvotes: 0

Iniq
Iniq

Reputation: 13

I was able to get around the error after removing the 15th+ little "comment boxes" on the sheet.

Upvotes: 1

Related Questions