BioProg
BioProg

Reputation: 153

Concatenating multiple .xls files with unknown number of columns

Using Python, I want to merge all .xls files in a directory into one data frame and save that as a new concatenated .xls file. The .xls files will have an unknown number of columns and no consistent headers.

I've used other suggestions on this forum and ended up with this:

import os
import pandas as pd

path = os.getcwd()
files = os.listdir(path)

files_xls = [f for f in files if f[-3:] == 'xls']

df = pd.DataFrame()

for f in files_xls:
    data = pd.read_excel(f for f in files_xls) # I dont understand what to add 
# in the parentheses here.
    df = df.append(data)
    df

I'm getting these errors:

File "<ipython-input-17-bb67a423cf40>", line 14, in <module>
  data = pd.read_excel(f for f in files_xls)

File "C:\Users\xxxx\Anaconda2\lib\site-packages\pandas\io\excel.py", line 170, in read_excel
  io = ExcelFile(io, engine=engine)

File "C:\Users\xxxx\Anaconda2\lib\site-packages\pandas\io\excel.py", line 229, in __init__
  raise ValueError('Must explicitly set engine if not passing in'

ValueError: Must explicitly set engine if not passing in buffer or path for io.

Upvotes: 0

Views: 113

Answers (1)

ℕʘʘḆḽḘ
ℕʘʘḆḽḘ

Reputation: 19395

try this brother

df = []

for f in files_xls:
    data = pd.read_excel(f) 
    df = df.append(data)

mydf = pd.concat(df, axis = 0)

Upvotes: 1

Related Questions