Victor
Victor

Reputation: 759

Why Pandas is not creating a new sheet in excel?

I have one giant csv file that contains a year worth intraday data of a stock, as it is too big to analyse, I would like to process the data beforehand.

I want to break it down into 12 excel files (12 months), each one would have 28-31 sheets/tabs/pages that contains that intraday data for that date. Hence I have prepared these excel files with names Stock-01, Stock-02 until Stock-12, they are empty at the moment, waiting for data to append by my python code.

I am using the code below after I use read_csv:

for index in indexs:
    name = "Stock_" + index.strftime("%y-%m-%d")

    work = data[data.index.date == index].copy()
    columnsTitles = ['Volume','Open','High','Low','Close']
    work = work.reindex(columns= columnsTitles)

    filepath = "Stock-{}.xlsx".format(index.strftime("%m"))
    writer = ExcelWriter(filepath, engine='openpyxl')
    work.to_excel(writer, index = True, sheet_name=name)
    writer.save()
    writer.close()

    del work

My approach is, indexs contains all the dates, hence running through using index, i select the prepared excel file, append a sheet with the name using the date (index), and copy the dataframe directly using the simple to_excel function.

However, the result is, it has only one sheet, and this sheet has the last date data. (One possible reason may be the program is creating a new workbook everytime rather than reading the existing workbook)

A similar question is asked, but that is working on another approach. I would like to write a dataframe directly to excel, but in this question, using xlsxwriter, I don't seem to be able to do it, as it accepts str value only.

Upvotes: 1

Views: 2048

Answers (2)

Biarys
Biarys

Reputation: 1173

The reason you have only one tab is because thru each iteration you create a new file called "Stock-{}.xlsx" with one tab in it. Depending on your code, you will either end up with multiple excel sheets, or one sheet that gets rewritten multiple times.

Edited after our discussion:

df = pd.read_csv("D:/AmiBackupeSignal/AMGN.txt", index_col="Date/Time")

df.index = pd.to_datetime(df.index)

# get list of all unique days/months 
days = df.index.day.unique().sort_values()
months = df.index.month.unique().sort_values()

for month in months:
    filepath = "Stock-{}.xlsx".format(month)
    with pd.ExcelWriter(filepath) as writer:
        for day in days:
            name = "Stock_" + "-".join([str(df.index.year.unique()[0]), str(month), str(day)])

            # get data for that month/day
            # this will create 31 tab
            temp = df.loc[(df.index.month == month) & (df.index.day == day)]
            temp.to_excel(writer, index = True, sheet_name=name)

        writer.save()

Upvotes: 1

Aditya
Aditya

Reputation: 352

What you can do is create blank excel file before for loop. Once that is excel book is there you can load it using

from openpyxl import load_workbook

book = load_workbook('CHANGE_THE_NAME.xlsx')

Then for each writer object in loop attach this book to it like this

writer = pandas.ExcelWriter('CHANGE_THE_NAME.xlsx', engine='openpyxl') 
writer.book = book

After that, you can use to_excel to write new sheets in same file. I hope this answers your question.

Upvotes: 0

Related Questions