Marionet

Reputation: 121

Python Openpyxl sheet.dimensions

I have a spreadsheet with data and want to know the first and last cells of the range that contains data. I use Python with openpyxl. The sheet.dimensions property returns the top-left and bottom-right cells of the area of non-empty cells.
How can I put the result of sheet.dimensions into two variables? For example, suppose sheet.dimensions returns A2:C20.
I need one variable containing the first cell of the range (here A2) and another containing the last cell of the range (here C20).
I expected something like sheet.dimensions.first and .last, but so far I cannot find the answer.

Stripped to its basics, the program is as follows:

import openpyxl

book = openpyxl.load_workbook('test 1.xlsx', data_only=True)
sheet = book.active

print(sheet.dimensions)

book.save("result test 1.xlsx")

When the program is run it gives the result A2:C20.

Upvotes: 3

Views: 12156

Answers (3)

Marionet

Reputation: 121

I found that openpyxl worksheets have the following attributes:

  • sheet.min_row
  • sheet.max_row
  • sheet.min_column
  • sheet.max_column

These are exactly the numbers I need; with them I know how many iterations it takes to process the whole worksheet. Example:

import openpyxl

book = openpyxl.load_workbook('test 1.xlsx', data_only=True)
sheet = book.active

mi_row = sheet.min_row
ma_row = sheet.max_row
mi_col = sheet.min_column
ma_col = sheet.max_column

print(mi_row, mi_col, ma_row, ma_col)

for row in sheet.iter_rows(min_row=mi_row, min_col=mi_col, max_row=ma_row, max_col=ma_col):
    for cell in row:
        pop = cell.value
        print(pop)
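
To get the two cell addresses the question asked for, the numeric bounds can be combined with openpyxl's get_column_letter utility. The following is only a sketch: it builds a small in-memory workbook (an assumption made so that it runs without 'test 1.xlsx'), and the names first_cell and last_cell are illustrative.

```python
from openpyxl import Workbook
from openpyxl.utils import get_column_letter

# Build a small in-memory sheet so the sketch runs without a file on disk.
# With a real file, use load_workbook(...) as in the example above instead.
book = Workbook()
sheet = book.active
sheet["A2"] = "first"
sheet["C20"] = "last"

# Turn the numeric column bounds into letters and append the row numbers.
first_cell = f"{get_column_letter(sheet.min_column)}{sheet.min_row}"
last_cell = f"{get_column_letter(sheet.max_column)}{sheet.max_row}"

print(first_cell, last_cell)
```

Both variables are plain strings, so they can be passed back to the sheet, for example sheet[first_cell].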

Upvotes: 9

alexander-paskal

Reputation: 11

Kinda late, but for anyone else using openpyxl: it has some useful utility functions for doing exactly this, and they handle edge cases such as absolute references ($A1) and single-cell ranges.

from openpyxl import load_workbook
from openpyxl.utils.cell import range_boundaries, get_column_letter

wb_path = 'test 1.xlsx'
wb = load_workbook(wb_path, data_only=True)
ws = wb.active


# utility functions offer robust handling of excel range strings
test_ranges = [
    "A1:C3",  # -> A1 and C3
    "$A1:C$3",  # -> A1 and C3
    "A2",  # -> A2 and A2
    ws.dimensions,  # the OP's case, e.g. "A2:C20"
]

for rng in test_ranges:
    # where the magic happens
    sc, sr, ec, er = range_boundaries(rng)
    start_cell = get_column_letter(sc) + str(sr)
    end_cell = get_column_letter(ec) + str(er)
    print(f"{start_cell} and {end_cell}")

Upvotes: 1

Henri Chat Noir

Reputation: 11

I would think that just using the split method on the returned string would get you started. That gives you two cell addresses, which openpyxl can then take as arguments to return the cell objects at those addresses.

The syntax is such that there shouldn't be more than one colon, though you might need to trap the odd case of a single cell on the sheet (perhaps then there is no colon at all). I would think this might do the job:

return_string = sheet.dimensions  # e.g. "A2:C20"
upper_left, lower_right = return_string.split(":")
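
A minimal sketch of the single-cell guard mentioned above, assuming the range string might arrive without a colon. The helper name split_range is hypothetical and not part of openpyxl; it uses str.partition rather than split so that a missing colon does not raise an error.

```python
def split_range(range_string):
    """Return (first_cell, last_cell) for a string such as "A2:C20" or "A2"."""
    first, sep, last = range_string.partition(":")
    if not sep:
        # No colon found: treat a lone address as a one-cell range.
        last = first
    return first, last


upper_left, lower_right = split_range("A2:C20")
print(upper_left, lower_right)

single_first, single_last = split_range("B5")
print(single_first, single_last)
```

In the original program the argument would be sheet.dimensions rather than a literal string.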

Upvotes: 0
