idillus
idillus

Reputation: 271

How to copy worksheet from one workbook to another one using openpyxl?

I have a large amount of EXCEL files (i.e. 200) I would like to copy one specific worksheet from one workbook to another one. I have done some investigations and I couldn't find a way of doing it with Openpyxl

This is the code I have developed so far

def copy_sheet_to_different_EXCEL(path_EXCEL_read,Sheet_name_to_copy,path_EXCEL_Save,Sheet_new_name):
''' Function used to copy one EXCEL sheet into another file.
    
    def path_EXCEL_read,Sheet_name_to_copy,path_EXCEL_Save,Sheet_new_name
    
Input data:
    1.) path_EXCEL_read: the location of the EXCEL file along with the name where the information is going to be saved
    2.) Sheet_name_to_copy= The name of the EXCEL sheet to copy
    3.) path_EXCEL_Save: The path of the EXCEL file where the sheet is going to be copied
    3.) Sheet_new_name: The name of the new EXCEL sheet
    
Output data:
    1.) Status= If 0, everything went OK. If 1, one error occurred.

Version History:
1.0 (2017-02-20): Initial version.

'''
status=0

if(path_EXCEL_read.endswith('.xls')==1): 
    print('ERROR - EXCEL xls file format is not supported by openpyxl. Please, convert the file to an XLSX format')
    status=1
    return status
    
try:
   wb = openpyxl.load_workbook(path_EXCEL_read,read_only=True)
except:
    print('ERROR - EXCEL file does not exist in the following location:\n  {0}'.format(path_EXCEL_read))
    status=1
    return status

Sheet_names=wb.get_sheet_names()    # We copare against the sheet name we would like to cpy

if ((Sheet_name_to_copy in Sheet_names)==0):
    print('ERROR - EXCEL sheet does not exist'.format(Sheet_name_to_copy))
    status=1
    return status   

# We checking if the destination file exists

if (os.path.exists(path_EXCEL_Save)==1):
    #If true, file exist so we open it
    
    if(path_EXCEL_Save.endswith('.xls')==1): 
        print('ERROR - Destination EXCEL xls file format is not supported by openpyxl. Please, convert the file to an XLSX format')
        status=1
    return status
    
    try:
        wdestiny = openpyxl.load_workbook(path_EXCEL_Save)
    except:
        print('ERROR - Destination EXCEL file does not exist in the following location:\n  {0}'.format(path_EXCEL_read))
        status=1
    return status

    #we check if the destination sheet exists. If so, we will delete it
    
    destination_list_sheets = wdestiny.get_sheet_names()
    
    if((Sheet_new_name in destination_list_sheets) ==True):
        print('WARNING - Sheet "{0}" exists in: {1}. It will be deleted!'.format(Sheet_new_name,path_EXCEL_Save))
        wdestiny.remove_sheet(Sheet_new_name) 

else:
    wdestiny=openpyxl.Workbook()
# We copy the Excel sheet
    
try:
    sheet_to_copy = wb.get_sheet_by_name(Sheet_name_to_copy) 
    target = wdestiny.copy_worksheet(sheet_to_copy)
    target.title=Sheet_new_name
except:
    print('ERROR - Could not copy the EXCEL sheet. Check the file')
    status=1
    return status

try:
    wdestiny.save(path_EXCEL_Save)
except:
    print('ERROR - Could not save the EXCEL sheet. Check the file permissions')
    status=1
    return status

#Program finishes
return status

Upvotes: 27

Views: 63087

Answers (12)

Doktop Aibolit
Doktop Aibolit

Reputation: 11

I'd add to @Oscar's answer:

target_sheet.print_options = copy(source_sheet.print_options)
target_sheet.protection = copy(source_sheet.protection)
target_sheet.sheet_state = copy(source_sheet.sheet_state)
target_sheet.views = copy(source_sheet.views)
target_sheet.data_validations = copy(source_sheet.data_validations)

Upvotes: 1

tkuyucu
tkuyucu

Reputation: 1

Using deepcopy instead of looping over source_sheet.row_dimensions in Oscar's solution will get rid of the LibreOffice error: "The data could not be loaded completely because the maximum number of rows per sheet was exceeded."

Upvotes: 0

Andrey Aleksandrov
Andrey Aleksandrov

Reputation: 11

By using openpyxl - not success with border copy. In my case - were successed using xlwings. It open excel in OS, copy tab to other excel, save, rename and close.

import openpyxl, os
import xlwings as xw

def copy_tab(file_old, tab_source, file_new, tab_destination):
    delete_tab = False
    if not os.path.exists(file_new):
        wb_target = openpyxl.Workbook()
        wb_target.save(file_new)
        delete_tab = True

    wb = xw.Book(file_old)
    app = wb.app
    app.visible = False
    sht = wb.sheets[tab_source]
    new_wb = xw.Book(file_new)
    new_app = new_wb.app
    new_app.visible = False
    sht.api.Copy(None, After=new_wb.sheets[-1].api)
    if delete_tab:
        new_wb.sheets['Sheet'].delete()
    wb.close()
    for sheet in new_wb.sheets:
        if tab_destination in sheet.name:
            sheet.delete()
    new_wb.sheets[tab_source].name = tab_destination
    new_wb.save()
    new_wb.close()

if __name__ == "__main__":
    file_old = r"C:\file_old.xlsx"
    file_new = r"C:\file_new.xlsx"

    copy_tab(file_old, "sheet_old", file_new, "sheet_new")

Upvotes: 1

Vedant
Vedant

Reputation: 1

It actually can be done in a very simple way ! It just need 3 steps :

  1. Open a file using load_workbook

    wb = load_workbook('File_1.xlsx')

  2. Select a sheet you want to copy

    ws = wb.active

  3. use name of the new file to save the file

    wb.save('New_file.xlsx')

This code will save sheet of first file (File_1.xlsx) to the secound file (New_file.xlsx).

Upvotes: -1

Charlie Clark
Charlie Clark

Reputation: 19497

You cannot use copy_worksheet() to copy between workbooks because it depends on global constants that may vary between workbooks. The only safe and reliable way to proceed is to go row-by-row and cell-by-cell.

You might want to read the discussions about this feature

Upvotes: 8

abs sh
abs sh

Reputation: 195

i found a way playing around with it

import openpyxl

xl1 = openpyxl.load_workbook('workbook1.xlsx')
# sheet you want to copy
s = openpyxl.load_workbook('workbook2.xlsx').active
s._parent = xl1
xl1._add_sheet(s)
xl1.save('some_path/name.xlsx')

Upvotes: 13

phyatt
phyatt

Reputation: 19102

For speed I am using data_only and read_only attributes when opening my workbooks. Also iter_rows() is really fast, too.

@Oscar's excellent answer needs some changes to support ReadOnlyWorksheet and EmptyCell

# Copy a sheet with style, format, layout, ect. from one Excel file to another Excel file
# Please add the ..path\\+\\file..  and  ..sheet_name.. according to your desire.
import openpyxl
from copy import copy


def copy_sheet(source_sheet, target_sheet):
    copy_cells(source_sheet, target_sheet)  # copy all the cel values and styles
    copy_sheet_attributes(source_sheet, target_sheet)


def copy_sheet_attributes(source_sheet, target_sheet):
    if isinstance(source_sheet, openpyxl.worksheet._read_only.ReadOnlyWorksheet):
        return
    target_sheet.sheet_format = copy(source_sheet.sheet_format)
    target_sheet.sheet_properties = copy(source_sheet.sheet_properties)
    target_sheet.merged_cells = copy(source_sheet.merged_cells)
    target_sheet.page_margins = copy(source_sheet.page_margins)
    target_sheet.freeze_panes = copy(source_sheet.freeze_panes)

    # set row dimensions
    # So you cannot copy the row_dimensions attribute. Does not work (because of meta data in the attribute I think). So we copy every row's row_dimensions. That seems to work.
    for rn in range(len(source_sheet.row_dimensions)):
        target_sheet.row_dimensions[rn] = copy(source_sheet.row_dimensions[rn])

    if source_sheet.sheet_format.defaultColWidth is None:
        print('Unable to copy default column wide')
    else:
        target_sheet.sheet_format.defaultColWidth = copy(source_sheet.sheet_format.defaultColWidth)

    # set specific column width and hidden property
    # we cannot copy the entire column_dimensions attribute so we copy selected attributes
    for key, value in source_sheet.column_dimensions.items():
        target_sheet.column_dimensions[key].min = copy(source_sheet.column_dimensions[key].min)   # Excel actually groups multiple columns under 1 key. Use the min max attribute to also group the columns in the targetSheet
        target_sheet.column_dimensions[key].max = copy(source_sheet.column_dimensions[key].max)  # https://stackoverflow.com/questions/36417278/openpyxl-can-not-read-consecutive-hidden-columns discussed the issue. Note that this is also the case for the width, not onl;y the hidden property
        target_sheet.column_dimensions[key].width = copy(source_sheet.column_dimensions[key].width) # set width for every column
        target_sheet.column_dimensions[key].hidden = copy(source_sheet.column_dimensions[key].hidden)


def copy_cells(source_sheet, target_sheet):
    for r, row in enumerate(source_sheet.iter_rows()):
        for c, cell in enumerate(row):
            source_cell = cell
            if isinstance(source_cell, openpyxl.cell.read_only.EmptyCell):
                continue
            target_cell = target_sheet.cell(column=c+1, row=r+1)

            target_cell._value = source_cell._value
            target_cell.data_type = source_cell.data_type

            if source_cell.has_style:
                target_cell.font = copy(source_cell.font)
                target_cell.border = copy(source_cell.border)
                target_cell.fill = copy(source_cell.fill)
                target_cell.number_format = copy(source_cell.number_format)
                target_cell.protection = copy(source_cell.protection)
                target_cell.alignment = copy(source_cell.alignment)

            if not isinstance(source_cell, openpyxl.cell.ReadOnlyCell) and source_cell.hyperlink:
                target_cell._hyperlink = copy(source_cell.hyperlink)

            if not isinstance(source_cell, openpyxl.cell.ReadOnlyCell) and source_cell.comment:
                target_cell.comment = copy(source_cell.comment)

With a usage something like

    wb = Workbook()
    
    wb_source = load_workbook(filename, data_only=True, read_only=True)
    for sheetname in wb_source.sheetnames:
        source_sheet = wb_source[sheetname]
        ws = wb.create_sheet("Orig_" + sheetname)
        copy_sheet(source_sheet, ws)

    wb.save(new_filename)

Upvotes: 5

Oscar
Oscar

Reputation: 381

I had the same problem. For me style, format, and layout were very important. Moreover, I did not want to copy formulas but only the value (of the formulas). After a lot of trail, error, and stackoverflow I came up with the following functions. It may look a bit intimidating but the code copies a sheet from one Excel file to another (possibly existing file) while preserving:

  1. font and color of text
  2. filled color of cells
  3. merged cells
  4. comment and hyperlinks
  5. format of the cell value
  6. the width of every row and column
  7. whether or not row and column are hidden
  8. frozen rows

It is useful when you want to gather sheets from many workbooks and bind them into one workbook. I copied most attributes but there might be a few more. In that case you can use this script as a jumping off point to add more.

###############
## Copy a sheet with style, format, layout, ect. from one Excel file to another Excel file
## Please add the ..path\\+\\file..  and  ..sheet_name.. according to your desire.

import openpyxl
from copy import copy

def copy_sheet(source_sheet, target_sheet):
    copy_cells(source_sheet, target_sheet)  # copy all the cel values and styles
    copy_sheet_attributes(source_sheet, target_sheet)


def copy_sheet_attributes(source_sheet, target_sheet):
    target_sheet.sheet_format = copy(source_sheet.sheet_format)
    target_sheet.sheet_properties = copy(source_sheet.sheet_properties)
    target_sheet.merged_cells = copy(source_sheet.merged_cells)
    target_sheet.page_margins = copy(source_sheet.page_margins)
    target_sheet.freeze_panes = copy(source_sheet.freeze_panes)

    # set row dimensions
    # So you cannot copy the row_dimensions attribute. Does not work (because of meta data in the attribute I think). So we copy every row's row_dimensions. That seems to work.
    for rn in range(len(source_sheet.row_dimensions)):
        target_sheet.row_dimensions[rn] = copy(source_sheet.row_dimensions[rn])

    if source_sheet.sheet_format.defaultColWidth is None:
        print('Unable to copy default column wide')
    else:
        target_sheet.sheet_format.defaultColWidth = copy(source_sheet.sheet_format.defaultColWidth)

    # set specific column width and hidden property
    # we cannot copy the entire column_dimensions attribute so we copy selected attributes
    for key, value in source_sheet.column_dimensions.items():
        target_sheet.column_dimensions[key].min = copy(source_sheet.column_dimensions[key].min)   # Excel actually groups multiple columns under 1 key. Use the min max attribute to also group the columns in the targetSheet
        target_sheet.column_dimensions[key].max = copy(source_sheet.column_dimensions[key].max)  # https://stackoverflow.com/questions/36417278/openpyxl-can-not-read-consecutive-hidden-columns discussed the issue. Note that this is also the case for the width, not onl;y the hidden property
        target_sheet.column_dimensions[key].width = copy(source_sheet.column_dimensions[key].width) # set width for every column
        target_sheet.column_dimensions[key].hidden = copy(source_sheet.column_dimensions[key].hidden)


def copy_cells(source_sheet, target_sheet):
    for (row, col), source_cell in source_sheet._cells.items():
        target_cell = target_sheet.cell(column=col, row=row)

        target_cell._value = source_cell._value
        target_cell.data_type = source_cell.data_type

        if source_cell.has_style:
            target_cell.font = copy(source_cell.font)
            target_cell.border = copy(source_cell.border)
            target_cell.fill = copy(source_cell.fill)
            target_cell.number_format = copy(source_cell.number_format)
            target_cell.protection = copy(source_cell.protection)
            target_cell.alignment = copy(source_cell.alignment)

        if source_cell.hyperlink:
            target_cell._hyperlink = copy(source_cell.hyperlink)

        if source_cell.comment:
            target_cell.comment = copy(source_cell.comment)


wb_target = openpyxl.Workbook()
target_sheet = wb_target.create_sheet(..sheet_name..)

wb_source = openpyxl.load_workbook(..path\\+\\file_name.., data_only=True)
source_sheet = wb_source[..sheet_name..]

copy_sheet(source_sheet, target_sheet)

if 'Sheet' in wb_target.sheetnames:  # remove default sheet
    wb_target.remove(wb_target['Sheet'])

wb_target.save('out.xlsx')

Upvotes: 27

NamasteJasutin
NamasteJasutin

Reputation: 39

My workaround goes like this:

You have a template file let's say it's "template.xlsx". You open it, make changes to it as needed, save it as a new file, close the file. Repeat as needed. Just make sure to keep a copy of the original template while testing/messing around.

Upvotes: 4

floatingpurr
floatingpurr

Reputation: 8559

I've just found this question. A good workaround, as mentioned here, could consists in modifying the original wb in memory and then saving it with another name. For example:

import openpyxl

# your starting wb with 2 Sheets: Sheet1 and Sheet2
wb = openpyxl.load_workbook('old.xlsx')

sheets = wb.sheetnames # ['Sheet1', 'Sheet2']

for s in sheets:

    if s != 'Sheet2':
        sheet_name = wb.get_sheet_by_name(s)
        wb.remove_sheet(sheet_name)

# your final wb with just Sheet1
wb.save('new.xlsx')

Upvotes: 3

IDontKnowCode
IDontKnowCode

Reputation: 48

A workaround I use is saving the current sheet as a pandas data frame and loading it to the excel workbook you need

Upvotes: 0

Bandi Sandeep
Bandi Sandeep

Reputation: 161

I had a similar requirement to collate data from multiple workbooks into one workbook. As there are no inbuilt methods available in openpyxl.

I created the below script to do the job for me.

Note: In my usecase all worbooks contain data in same format.

from openpyxl import load_workbook
import os


# The below method is used to read data from an active worksheet and store it in memory.
def reader(file):
    global path
    abs_file = os.path.join(path, file)
    wb_sheet = load_workbook(abs_file).active
    rows = []
    # min_row is set to 2, to ignore the first row which contains the headers
    for row in wb_sheet.iter_rows(min_row=2):
        row_data = []
        for cell in row:
            row_data.append(cell.value)
        # custom column data I am adding, not needed for typical use cases
        row_data.append(file[17:-6])
        # Creating a list of lists, where each list contain a typical row's data
        rows.append(row_data)
    return rows


if __name__ == '__main__':
    # Folder in which my source excel sheets are present
    path = r'C:\Users\tom\Desktop\Qt'
    # To get the list of excel files
    files = os.listdir(path)
    for file in files:
        rows = reader(file)
        # below mentioned file name should be already created
        book = load_workbook('new.xlsx')
        sheet = book.active
        for row in rows:
            sheet.append(row)
        book.save('new.xlsx')

Upvotes: 3

Related Questions