Reputation: 13
I am trying to create a script using python and openpyxl to open up a given excel sheet and merge all cells in a given row together until the script finds a cell containing a string. The row placement is always the same, but the number of columns and the column placement of the strings is not so it needs to be dynamic. Once a new string is found, I want to continue to merge cells until the column that is right before the grand total. There are also cases where the cell doesn't need to be merged, because there is no empty cell in the data set to merge it with.
I found this answer here, which is doing a similar procedure except it is merging rows instead of columns. I was able to refactor part of this to create a list of the cells that have strings in my workbook, but am struggling on next steps. Any thoughts?
import openpyxl
from openpyxl.utils import get_column_letter
from openpyxl import Workbook
wb1 = openpyxl.load_workbook('stackoverflow question.xlsx')
ws1 = wb1.worksheets['ws1']
columns_with_strings = []
merge_row = '3' #the data to merge will always be in this row
for col in range (2, ws1.max_column-1):
for row in merge_row:
if ws1[get_column_letter(col) + merge_row].value != None:
columns_with_strings.append(str(get_column_letter(col) + merge_row)
The above code yields this list which includes the correct cells that contain strings and need to be checked for merging:
['C3', 'F3', 'J3']
This is how the workbook looks now:
This is how I am trying to get it to look in the end:
Upvotes: 0
Views: 460
Reputation: 19497
Based on @timeless' answer I've cleaned the code up a bit to make better use of Python's tools and the openpyxl API
from openpyxl import Workbook
wb = Workbook()
ws = wb.active
ws.append([])
ws.append([])
ws.append([None, None, "Group A", None, None, "Group B", None, None, None, "Group C"])
# get column indices for header cells
headings = [cell.column for cell in next(ws.iter_rows(min_row=3, max_row=3)) if cell.value]
from openpyxl.styles import Alignment, PatternFill, NamedStyle
fill = PatternFill(patternType="solid", fgColor="DDDDDD")
alignment = Alignment(horizontal="center")
header_style = NamedStyle(alignment=alignment, fill=fill, name="Header")
wb.named_styles.append(header_style)
from itertools import zip_longest
# create ranges for merged cells from the list of header cells: the boundary of the first range, is the index of the start of the next minus 1. Use zip_longest for the final value
for start_column, end_column in zip_longest(headings, headings[1:], fillvalue=headings[-1]+1):
ws.cell(3, start_column).style = header_style
ws.merge_cells(start_row=3, end_row=3, start_column=start_column, end_column=end_column-1)
wb.save("merged.xlsx")
Using the API wherever possible generally leads to more manageable and generic code.
Upvotes: 0
Reputation: 37737
To complete your code, you can use worksheet.merge_cells
with worhseet.cell.alignment
:
from openpyxl import load_workbook
from openpyxl.styles import Alignment
wb = load_workbook("tmp/stackoverflow question.xlsx")
ws = wb["Sheet1"]
merge_row = 3
#here, we get the columns idx for every non null cell in row 3
#and after that, we make a text alignment (center) in the last cell
idx_col_strings = [cell.column for cell in ws[merge_row] if cell.value]
ws.cell(3, idx_col_strings[-1]).alignment = Alignment(horizontal="center")
#here, we loop through each range until the last non null cell in row 3
#then, we make a merge as much as the number of transitions (non null => null)
#and finally, we make a text alignement (center) for each cell/merge
for i in range(len(idx_col_strings)-1):
start_col, end_col = idx_col_strings[i], idx_col_strings[i+1]-1
ws.merge_cells(start_row=merge_row, start_column=start_col,
end_row=merge_row, end_column=end_col)
ws.cell(merge_row, start_col).alignment = Alignment(horizontal="center")
wb.save("tmp/stackoverflow answer.xlsx")
BEFORE :
AFTER :
Upvotes: 1
Reputation: 63
To start, if you aren't familiar with openpyxl's merge and unmerge functions, I recommend your read about them in the documentation (https://openpyxl.readthedocs.io/en/stable/usage.html#merge-unmerge-cells) to get a sense of how this works.
Here is base code that should provide the functionality you are wanting, but some values may need tweaked for your device or spreadsheet.
import openpyxl # Necessary imports.
from openpyxl.utils import get_column_letter
from openpyxl.utils.cell import coordinate_from_string
from openpyxl.utils.cell import column_index_from_string
from openpyxl import Workbook
wb1 = openpyxl.load_workbook('stackoverflow question.xlsx') # Start of your code.
ws1 = wb1.worksheets[0]
columns_with_strings = []
merge_row = '3' #the data to merge will always be in this row
for col in range (2, ws1.max_column):
for row in merge_row:
if ws1[get_column_letter(col) + merge_row].value != None:
columns_with_strings.append(str(get_column_letter(col) + merge_row)) # End of your code.
prior_string = columns_with_strings[0] # Set the "prior_string" to be the first detected string.
for cell in columns_with_strings:
coords = coordinate_from_string(cell) # Split "prior_string" into the letter and number components.
if column_index_from_string(coords[0]) >1:
prior = str(get_column_letter(column_index_from_string(coords[0])-1)) + str(coords[1]) # Get the cell that is left of the cell "prior_string"
if prior > prior_string:
ws1.merge_cells(f'{prior_string}:{prior}') # Merge the cells.
prior_string=cell # Set the current string to be the prior string.
ws1.merge_cells(f'{cell}:{get_column_letter(ws1.max_column)+str(coords[1])}') # Merge the last string to the end (the last column).
wb1.save("stackoverflow question.xlsx") # Save the file changes.
I hope this helps to point you in the right direction!
Upvotes: 0