Reputation: 121
I have a spreadsheet with data and want to know the first and last cell of the range that contains data. I use Python with Openpyxl. The sheet.dimensions property returns the top-left and bottom-right cell of the area of non-empty cells.
How can I put the result of sheet.dimensions into two variables? For example, suppose sheet.dimensions returns A2:C20.
I need one variable containing the first cell of the range (here A2) and another containing the last cell of the range (here C20).
I expected something like sheet.dimensions.first and sheet.dimensions.last, but so far I cannot find the answer.
Stripped to its basics, the program is as follows:
import openpyxl
book = openpyxl.load_workbook('test 1.xlsx', data_only=True)
sheet = book.active
print(sheet.dimensions)
book.save("result test 1.xlsx")
When the program is run it gives the result A2:C20.
Upvotes: 3
Views: 12156
Reputation: 121
I found that an openpyxl worksheet has the following attributes: min_row, max_row, min_column and max_column.
These are exactly the numbers that I need and with these I know how many iterations are needed to process the whole worksheet. Example:
import openpyxl
book = openpyxl.load_workbook('test 1.xlsx', data_only=True)
sheet = book.active
mi_row = sheet.min_row
ma_row = sheet.max_row
mi_col = sheet.min_column
ma_col = sheet.max_column
print(mi_row, mi_col, ma_row, ma_col)
for row in sheet.iter_rows(min_row=mi_row, min_col=mi_col, max_row=ma_row, max_col=ma_col):
    for cell in row:
        pop = cell.value
        print(pop)
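If the cell addresses themselves are needed rather than the numbers, the four bounds can be turned back into strings such as A2 and C20. The following is only a minimal sketch: column_letter and corner_cells are hypothetical helper names, not part of openpyxl, and the bounds are hard-coded to match the A2:C20 example instead of being read from a workbook. openpyxl ships its own get_column_letter in openpyxl.utils for the same conversion, which would normally be the better choice.

```python
def column_letter(index):
    """Convert a 1-based column index to Excel letters (1 -> 'A', 27 -> 'AA')."""
    if index < 1:
        raise ValueError("column index must be >= 1")
    letters = ""
    while index > 0:
        # Excel columns are a base-26 system without a zero digit,
        # hence the shift by one before dividing.
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def corner_cells(min_row, min_col, max_row, max_col):
    """Return the top-left and bottom-right addresses for the given bounds."""
    first = f"{column_letter(min_col)}{min_row}"
    last = f"{column_letter(max_col)}{max_row}"
    return first, last


# Bounds assumed to match the A2:C20 example from the question.
first_cell, last_cell = corner_cells(2, 1, 20, 3)
print(first_cell, last_cell)
```

With a real worksheet, the arguments would be sheet.min_row, sheet.min_column, sheet.max_row and sheet.max_column.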
Upvotes: 9
Reputation: 11
A bit late, but for anyone else using openpyxl: it has utility functions for exactly this, and they handle edge cases such as absolute references and single-cell ranges.
from openpyxl import load_workbook
from openpyxl.utils.cell import range_boundaries, get_column_letter
wb_path = 'test 1.xlsx'
wb = load_workbook(wb_path, data_only=True)
ws = wb.active
# utility functions offer robust handling of Excel range strings
test_ranges = [
    "A1:C3",        # -> A1 and C3
    "$A1:C$3",      # -> A1 and C3
    "A2",           # -> A2 and A2
    ws.dimensions,  # OP's case
]
for rng in test_ranges:
    # where the magic happens: (min_col, min_row, max_col, max_row)
    sc, sr, ec, er = range_boundaries(rng)
    start_cell = get_column_letter(sc) + str(sr)
    end_cell = get_column_letter(ec) + str(er)
    print(f"{start_cell} and {end_cell}")
Upvotes: 1
Reputation: 11
I would think that simply using the split method on the returned string would get you started. That gives you two cell addresses, which openpyxl can then take as arguments to return the cell objects at those addresses.
The syntax is such that there should never be more than one colon, though you might need to trap the odd case of a single cell on the sheet (perhaps then there is no colon at all). With return_string holding the value of sheet.dimensions, I would think this does the job:
upper_left, lower_right = return_string.split(":")
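To cover the possible no-colon case mentioned above, str.partition can replace split, since it always returns three parts. This is only a sketch: split_dimensions is a hypothetical helper name, and whether sheet.dimensions can actually come back without a colon is an assumption, not something confirmed here.

```python
def split_dimensions(dimensions):
    """Split a range string such as 'A2:C20' into its two corner addresses."""
    first, sep, last = dimensions.partition(":")
    if not sep:
        # No colon found: treat the single address as both corners.
        last = first
    return first, last


upper_left, lower_right = split_dimensions("A2:C20")
print(upper_left, lower_right)
```

In the original program, the argument would be sheet.dimensions.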
Upvotes: 0