How do I get this loop to work correctly when writing pandas df to xlsx?

Question

I have used this code, which is kind of working. Right now in the smaller 'rep_list' as it executes the first rep in the list which is CP is adds it, but then when it goes to AM it overwrites the CP. SO right now when I run this code it only actually saves the last person in the loop. If I run the code with just "CP" and then just "AM" it appends it as it should. Is it something wrong with the for loop? or is it an issue with the workbook itself?

import pandas as pd
import datetime
from openpyxl import load_workbook

now = datetime.datetime.now()
currentDate = now.strftime("%Y-%m-%d")
call_report = pd.read_excel("Ending 2016-07-30.xlsx", "raw_data")

#rep_list = ["CP", "AM", "JB", "TT", "KE"]
rep_list = ["CP", "AM"]

def call_log_reader(rep_name):
    rep_log = currentDate + "-" + rep_name + ".csv"
    df = pd.read_csv(rep_log)
    df = df.drop(['From Name', 'From Number', 'To Name / Reference', 'To Number', 'Billing Code', 'Original Dialed Number',
     'First Hunt Group', 'Last Hunt Group'], axis=1)
    df['rep'] = rep_name

    book = load_workbook('Ending 2016-07-30.xlsx')
    writer = pd.ExcelWriter('Ending 2016-07-30.xlsx', engine='openpyxl')
    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
    df.to_excel(writer, "raw_data", index=False)
    writer.save()
    ## I tried adding this : writer.close() hoping it would close the book and then force it to reopen for the next rep in the loop but it doesn't seem to work.

for rep in rep_list:
    call_log_reader(rep)

Thank you so much!

EDIT:

Gaurav Dhama gave a great answer that worked excellent. He pointed out that there is a bit of a limitation with the Pandas excelwriter (refer to this link) and proposed a solution in which each rep gets their own sheet in the end. This worked, however after I thought on it I opted against the additional sheets and came up with this solution knowing the limitation existed. Basically, I appended a CSV instead of the actual XLSX file, and then at the end opened that CSV and appended the one big list into the XLSX file. Either one works, just depends on what you're final product looks like.

import pandas as pd
import datetime
from openpyxl import load_workbook

now = datetime.datetime.now()
currentDate = now.strftime("%Y-%m-%d")
call_report = "Ending 2016-07-30.xlsx"
#rep_list = ["CP", "AM", "JB", "TT", "KE"]
rep_list = ["CP", "AM"]
csv_to_xl_files = []
merged_csv = currentDate + "-master.csv"

def call_log_reader(rep_name):
    rep_log = currentDate + "-" + rep_name + ".csv"
    df = pd.read_csv(rep_log)
    df = df.drop(['TimestampDetail', 'Billing Code', 'From Name', 'From Number', 'To Name / Reference', 'To Number',
              'Original Dialed Number', 'First Hunt Group', 'Last Hunt Group'], axis=1)
    df['rep'] = rep_name
    #print (df.head(3))
    df.to_csv(merged_csv, mode='a', index=False, header=False)
    csv_to_xl_files.append(rep_log)

book = load_workbook(call_report)
writer = pd.ExcelWriter(call_report, engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

for rep in rep_list:
    call_log_reader(rep)

master_df = pd.read_csv(merged_csv)
master_df.to_excel(writer, "raw_data", index=False)
writer.save()

#this csv_to_xl_files list isn't finished yet, basically I'm going to use it to delete the files from the directory as I don't need them once the script is run.
print (csv_to_xl_files)

Gaurav Dhama · Accepted Answer

Try using the following:

import pandas as pd
import datetime
from openpyxl import load_workbook

now = datetime.datetime.now()
currentDate = now.strftime("%Y-%m-%d")
call_report = pd.read_excel("Ending 2016-07-30.xlsx", "raw_data")

#rep_list = ["CP", "AM", "JB", "TT", "KE"]
rep_list = ["CP", "AM"]

def call_log_reader(rep_name):
    rep_log = currentDate + "-" + rep_name + ".csv"
    df = pd.read_csv(rep_log)
    df = df.drop(['From Name', 'From Number', 'To Name / Reference', 'To Number', 'Billing Code', 'Original Dialed Number',
     'First Hunt Group', 'Last Hunt Group'], axis=1)
    df['rep'] = rep_name
    df.to_excel(writer, "raw_data"+rep, index=False)
    return df

book = load_workbook('Ending 2016-07-30.xlsx')
writer = pd.ExcelWriter('Ending 2016-07-30.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

for rep in rep_list:
    call_log_reader(rep)

writer.save()

How do I get this loop to work correctly when writing pandas df to xlsx?

Answers (2)

Related Questions