Jeppe Lytter
Jeppe Lytter

Reputation: 41

Python Pandas ExcelWriter append to sheet creates a new sheet

I would I really appreciate some help.

I'm trying to use a loop to create sheets, and add data to those sheets for every loop. The position of my data is correct, however Panda ExcelWriter creates a new sheet instead of appending to the one created the first time the loop runs.

I'm a beginner, and right function is over form, so forgive me.

My code:

import pandas as pd


# initial files for dataframes
excel_file = 'output.xlsx'
setup_file = 'setup.xlsx'

# write to excel
output_filename = 'output_final.xlsx'




df = pd.read_excel(excel_file)  # create dataframe of entire sheet
df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_').str.replace('(', '').str.replace(')',
                                                                                                       '')  # clean dataframe titles

df_setup = pd.read_excel(setup_file)
df_setup.columns = df_setup.columns.str.strip().str.lower().str.replace(' ', '_').str.replace('(', '').str.replace(')',
                                                                                                                   '')  # clean dataframe titles

df_2 = pd.merge(df, df_setup)  # Merge data with setup to have krymp size for each wire in dataframe

df_2['wirelabel'] = "'" + df_2['cable'] + "_" + df_2['function_code'] + "-" + df_2['terminal_strip'] + ":" + df_2[
    'terminal']  # creates column for the wirelabel by appending columns with set delimiters. #TODO: delimiters to be by inputs.

df_2.sort_values(by=['switchboard'])  # sort so we get proper order

switchboard_unique = df.switchboard.unique().tolist()  # crate variable containing unique switchboards for printing to excel sheets


def createsheets(output_filename, sheetname, row_start, column_start, df_towrite):
    with pd.ExcelWriter(output_filename, engine='openpyxl', mode='a') as writer:
        df_towrite.to_excel(writer, sheet_name=sheetname, columns=['wirelabel'], startrow=row_start, startcol=column_start, index=False, header=False)
        writer.save()
        writer.close()

def sorter():
    for s in switchboard_unique:
        df_3 = df_2.loc[df_2['switchboard'] == s]
        krymp_unique = df_3.krymp.unique().tolist()
        krymp_unique.sort()
        # print(krymp_unique)
        column_start = 0
        row_start = 0
        for k in krymp_unique:
            df_3.loc[df_3['krymp'] == k]
            # print(k)
            # print(s)
            # print(df_3['wirelabel'])
            createsheets(output_filename, s, row_start, column_start, df_3)
            column_start = column_start + 1

sorter()

current behavior:

if sheetname is = sheet, then my script creates sheet1, sheet2, sheet3..etc.

pictureofcurrent

Wanted behavior

Create a sheet for each item in "df_3", and put data into columns according to the position calculated in column_start. The position in my code works, just goes to the wrong sheet.

pictureofwanted

I hope it's clear what im trying to accomplish, and all help is appriciated. I tried all example codes i have sound regarding writing to excel.

Upvotes: 2

Views: 1822

Answers (1)

Jeppe Lytter
Jeppe Lytter

Reputation: 41

I know my code is not a work of art, but I will update this post with the answer to my own question for the sake of completeness, and if anyone stumbles on this post.

It turns out i misunderstood the capabilities of the "append" function in Pandas "pd.ExcelWriter". It is not possible to append to a sheet already existing, the sheet will get overwritten though mode is set to 'a'.

Realizing this i changed my code to build a dataframe for the entire sheet (df_sheet), an then call the "createsheets" function in my code. The first version wrote my data column by column.

"Final" code:

import pandas as pd

# initial files for dataframes
excel_file = 'output.xlsx'
setup_file = 'setup.xlsx'

# write to excel
output_filename = 'output_final.xlsx'
column_name = 0

df = pd.read_excel(excel_file)  # create dataframe of entire sheet
df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_').str.replace('(', '').str.replace(')',
                                                                                                       '')  # clean dataframe titles

df_setup = pd.read_excel(setup_file)
df_setup.columns = df_setup.columns.str.strip().str.lower().str.replace(' ', '_').str.replace('(', '').str.replace(')',
                                                                                                                   '')  # clean dataframe titles

df_2 = pd.merge(df, df_setup)  # Merge data with setup to have krymp size for each wire in dataframe

df_2['wirelabel'] = "'" + df_2['cable'] + "_" + df_2['function_code'] + "-" + df_2['terminal_strip'] + ":" + df_2[
    'terminal']  # creates column for the wirelabel by appending columns with set delimiters. #TODO: delimiters to be by inputs.

df_2.sort_values(by=['switchboard'])  # sort so we get proper order

switchboard_unique = df.switchboard.unique().tolist()  # crate variable containing unique switchboards for printing to excel sheets




def createsheets(output_filename, sheetname, df_towrite):
    with pd.ExcelWriter(output_filename, engine='openpyxl', mode='a') as writer:
        df_towrite.to_excel(writer, sheet_name=sheetname, index=False, header=True)


def to_csv_file(output_filename, df_towrite):
    df_towrite.to_csv(output_filename, mode='w', index=False)

def sorter():
    for s in switchboard_unique:
        df_3 = df_2.loc[df_2['switchboard'] == s]

        krymp_unique = df_3.krymp.unique().tolist()
        krymp_unique.sort()

        column_start = 0
        row_start = 0
        df_sheet = pd.DataFrame([])

        for k in krymp_unique:
            df_5 = df_3.loc[df_3['krymp'] == k]
            df_4 = df_5.filter(['wirelabel'])


            column_name = "krymp " + str(k) + " Tavle: " + str(s)

            df_4 = df_4.rename(columns={"wirelabel": column_name})
            df_4 = df_4.reset_index(drop=True)
            df_sheet = pd.concat([df_sheet, df_4], axis=1)

            column_start = column_start + 1
            row_start = row_start + len(df_5.index) + 1

        createsheets(output_filename, s, df_sheet)
        to_csv_file(s + ".csv", df_sheet)


sorter()


Thank you.

Upvotes: 2

Related Questions