Python to combine Excel spreadsheets

Question

Hello all…a question in using Panda to combine Excel spreadsheets.

The problem is that, sequence of columns are lost when they are combined. If there are more files to combine, the format will be even worse.

If gives an error message, if the number of files are big.

ValueError: column index (256) not an int in range(256)

What I am using is below:

import pandas as pd

df = pd.DataFrame()

for f in ['c:\1635.xls', 'c:\1644.xls']:
    data = pd.read_excel(f, 'Sheet1')
    data.index = [os.path.basename(f)] * len(data)
    df = df.append(data)

df.to_excel('c:\CB.xls')

The original files and combined look like: enter image description here

what's the best way to combine great amount of such similar Excel files?

thanks.

Sinux · Accepted Answer

I usually use xlrd and xlwt:

#!/usr/bin/env python
# encoding: utf-8

import xlwt
import xlrd
import os


current_file = xlwt.Workbook()
write_table = current_file.add_sheet('sheet1', cell_overwrite_ok=True)

key_list = [u'City', u'Country', u'Received Date', u'Shipping Date', u'Weight', u'1635']
for title_index, text in enumerate(key_list):
    write_table.write(0, title_index, text)


file_list = ['1635.xlsx', '1644.xlsx']

i = 1
for name in file_list:
    data = xlrd.open_workbook(name)

    table = data.sheets()[0]
    nrows = table.nrows
    for row in range(nrows):
        if row == 0:
            continue
        for index, context in enumerate(table.row_values(row)):
            write_table.write(i, index, context)
        i += 1


current_file.save(os.getcwd() + '/result.xls')

Python to combine Excel spreadsheets

Answers (2)

Related Questions