Reputation: 9358
I am combining Excel spreadsheets into 1 spreadsheet, and want to insert the filenames of the spreadsheets as first column when they are combined.
The Excel files to be combined are: 20170801.xlsx, 20170901.xlsx, 20171001.xlsx
This is the code I have:
working_folder = 'c:\\abc\\'
files = os.listdir(working_folder)
files_xls = [];
for f in files:
if f.endswith(".xlsx"):
fff = working_folder + f
files_xls.append(fff)
df = pd.DataFrame();
for f in files_xls:
data = pd.read_excel(f, skip_footer=0)
file_date = basename(f).replace(".xlsx", "")
#data.index = pd.MultiIndex.from_tuples([("", file_date)] * len(data))
data.index = pd.MultiIndex.from_tuples([(file_date)] * len(data))
df = df.append(data)
df.to_excel(working_folder + 'final.xls', merge_cells = False)
The problem lies in the line data.index = pd.MultiIndex….
When I use:
data.index = pd.MultiIndex.from_tuples([(file_date)] * len(data))
it produce a spreadhsheet like this:
When I use:
data.index = pd.MultiIndex.from_tuples([("", file_date)] * len(data))
It produce a better looking spreadsheet but an empty column A like this:
Not to use MultiIndex, how can I have a final spreadsheet looked like this?
Thank you.
Upvotes: 2
Views: 867
Reputation: 863431
I believe you dont need MultiIndex
, only change:
data.index = pd.MultiIndex.from_tuples([(file_date)] * len(data))
to:
data.index = [file_date] * len(data)
Upvotes: 5