Reputation: 61
I'm coding a Python 3.7.2 script to check for unused worksheets in an Excel file, and I need a way to determine if the current worksheet contains no values. Previously when I coded in VBA, I would find the last-used-row and last-used-column to check if the sheet was 'empty', but the max_column and max_row functions in openpyxl take into account non-value data like formatting which I want to ignore. Is there any way to check if none of the cells of a given worksheet contain any values?
Previously when I coded in VBA, I would find the last-used-row and last-used-column to check if the sheet was 'empty', but the max_column and max_row functions in openpyxl take into account non-value data like formatting which I want to ignore.
Ideally I'd be able to find a way to get the last_used_column and last_used_row like I used to in VBA, but I'd appreciate it if anyone could tell me of any tools I could use to build a simple function where you pass it a worksheet and it spits out True or False depending on whether it has no values or not.
Upvotes: 1
Views: 2738
Reputation: 1
This seems to be a persistent, vexing problem with Python3 and OpenPyXL; hopefully the dev teams will address this more elegantly in future releases. In the mean time, my inelegant solution relies on the max_column max_row tests described by previous posters:
for sheet in workBook.worksheets:
if (sheet.max_column == 1) and (sheet.max_row == 1):
# see if cell is empty
if sheet.cell(row=1,column=1).value == None:
print('Sheet ' + sheet.title + ' is empty. Ignoring ... ')
continue
do other stuff
I inserted this if test in a script that converted any *.xlsx? files in a source directory to CSV files to get the processing past a "TypeError: 'tuple'" error. The script would work fine until it encountered any empty sheet in a workbook; then it would fail. Those simple few lines solved the issue.
Upvotes: 0
Reputation: 8047
In contrast to my suggestion in a comment, ws.get_cell_collection()
is deprecated, but you could iterate over the openpyxl.worksheet.worksheet
module's
values property, since (from the docs):
values
Produces all cell values in the worksheet, by rowType: generator
With a workbook workbook.xlsx
containing two worksheets, Sheet1
does contain some cell with values, but Sheet2
is visibly empty with all empty cells:
wb = load_workbook('workbook.xlsx')
for name in wb.get_sheet_names():
isempty = False
for row in wb.get_sheet_by_name(name).values:
if all(value == None for value in row):
isempty = True
else:
isempty = False
if isempty:
print(name)
outputs the empty sheet(s):
Sheet2
Although this works it is a bit more complicated than using only:
empty_sheets = [name for name in wb.get_sheet_names() if not (wb.get_sheet_by_name(name).get_cell_collection())]
which still works albeit deprecated.
Upvotes: 2