Zach Roy
Zach Roy

Reputation: 25

openpyxl - Compiling identical sheets to new sheet

I have an excel workbook that consists of multiple tabs of the same style of table (same number of columns, same data types). I want to combine (append? stack? outer-join? whichever!) each of the tables into one new worksheet, with the following requirements:

So far my conceptual process looks like this:

  1. Load in the workbook
  2. Get a list of sheets
  3. Loop over the list of sheets I created.
  4. Add a new column A, that contains the name of the sheet in every cell.
  5. Add anew empty sheet
  6. Loop over the list of sheets I created again
  7. Append the data to my new sheet, excluding the first row after the first iteration.
  8. Save the workbook

Here is the code that I have so far:

def combine(path):
    # Load in workbook
    wb = xl.load_workbook(filename=path)

    # Get a list of sheets
    ws_list = wb.sheetnames

    # Loop over the list of sheets I created
    for i, ws in enumerate(ws_list):
        # Add a new column A
        wb[ws].insert_cols(idx=0)  
        for column in wb[ws]['A{0}:A{1}'.format(wb[ws].min_row, wb[ws].max_row)]:
            for cell in column:
                # that contains the name of the sheet in every cell
                cell.value = str(ws)

    # Add a new empty sheet
    wb.create_sheet(title=u'COMBINED',index=0)
    # Loop over the list of sheets I created againv
    for i, ws in enumerate(ws_list):
        # Append the data (currently should copy headers as well)
        wb['COMBINED'].append(wb[ws].rows)

    # Save workbook
    wb.save(path) 

And this is the error I get

Note that 'Fall 03' is the first tab name, and is the text added to the first column of the first tab. Because of this I'm assuming that the process is working up through step 5.

Traceback (most recent call last):
  File "test.py", line 194, in <module>
    main()
  File "test.py", line 188, in main
    combine(excel_path)
  File "test.py", line 166, in combine
    wb['COMBINED'].append(wb[ws].rows)                          <- my script
  File "C:\Python27\ArcGISx6410.5\lib\site-packages\openpyxl\worksheet\worksheet.py", line 777, in append
    cell = Cell(self, row=row_idx, col_idx=col_idx, value=content)
  File "C:\Python27\ArcGISx6410.5\lib\site-packages\openpyxl\cell\cell.py", line 115, in __init__
    self.value = value
  File "C:\Python27\ArcGISx6410.5\lib\site-packages\openpyxl\cell\cell.py", line 294, in value
    self._bind_value(value)
  File "C:\Python27\ArcGISx6410.5\lib\site-packages\openpyxl\cell\cell.py", line 207, in _bind_value
    raise ValueError("Cannot convert {0!r} to Excel".format(value))
ValueError: Cannot convert (<Cell u'Fall 03'.A1>, <Cell u'Fall 03'.B1>, <Cell u'Fall 03'.C1>, <Cell u'Fall 03'.D1>, <Cell u'Fall 03'.E1>, <Cell u'Fall 03'.F1>, <Cell u'Fall 03'.G1>, <Cell u'Fall 03'.H1>, <Cell u'Fall 03'.I1>, <Cell u'Fall 03'.J1>, <Cell u'Fall 03'.K1>, <Cell u'Fall 03'.L1>, <Cell u'Fall 03'.M1>, <Cell u'Fall 03'.N1>, <Cell u'Fall 03'.O1>, <Cell u'Fall 03'.P1>, <Cell u'Fall 03'.Q1>, <Cell u'Fall 03'.R1>, <Cell u'Fall 03'.S1>, <Cell u'Fall 03'.T1>, <Cell u'Fall 03'.U1>, <Cell u'Fall 03'.V1>, <Cell u'Fall 03'.W1>, <Cell u'Fall 03'.X1>, <Cell u'Fall 03'.Y1>, <Cell u'Fall 03'.Z1>, <Cell u'Fall 03'.AA1>, <Cell u'Fall 03'.AB1>, <Cell u'Fall 03'.AC1>, <Cell u'Fall 03'.AD1>, <Cell u'Fall 03'.AE1>, <Cell u'Fall 03'.AF1>, <Cell u'Fall 03'.AG1>, <Cell u'Fall 03'.AH1>, <Cell u'Fall 03'.AI1>, <Cell u'Fall 03'.AJ1>, <Cell u'Fall 03'.AK1>, <Cell u'Fall 03'.AL1>) to Excel

If I can be picky

So far I am trying to stick to openpyxl, because I know it is installed on all the machines I will need to use. I can also use any module that is included in the standard ArcGIS 10.5 ArcPy distribution.

I have looked at these solutions, and none of them seem to work:

  1. This one seems to use the '.rows' iterable incorrectly
  2. This one worn't work because I don't know the number of rows in the existing tables.
  3. This one won't work because they are asking to copy between tables, which is a non-trivial distiction for some reason.

Upvotes: 0

Views: 246

Answers (1)

Charlie Clark
Charlie Clark

Reputation: 19497

You can't do this wb['COMBINED'].append(wb[ws].rows). You can only append a sequence of simple items such as numbers or strings and you're trying to pass in sequence of tuples containing Cell objects. for row in ws1.values: ws.append(row) would be possible.

Your code is also over complex: you can simply loop over the sheets in a workbook: for ws in wb and use ws.iter_rows() and ws.iter_cols() for parametric access.

Upvotes: 1

Related Questions