Reputation: 759
I have one giant csv file that contains a year worth intraday data of a stock, as it is too big to analyse, I would like to process the data beforehand.
I want to break it down into 12 excel files (12 months), each one would have 28-31 sheets/tabs/pages that contains that intraday data for that date. Hence I have prepared these excel files with names Stock-01, Stock-02 until Stock-12, they are empty at the moment, waiting for data to append by my python code.
I am using the code below after I use read_csv:
for index in indexs:
name = "Stock_" + index.strftime("%y-%m-%d")
work = data[data.index.date == index].copy()
columnsTitles = ['Volume','Open','High','Low','Close']
work = work.reindex(columns= columnsTitles)
filepath = "Stock-{}.xlsx".format(index.strftime("%m"))
writer = ExcelWriter(filepath, engine='openpyxl')
work.to_excel(writer, index = True, sheet_name=name)
writer.save()
writer.close()
del work
My approach is, indexs contains all the dates, hence running through using index, i select the prepared excel file, append a sheet with the name using the date (index), and copy the dataframe directly using the simple to_excel function.
However, the result is, it has only one sheet, and this sheet has the last date data. (One possible reason may be the program is creating a new workbook everytime rather than reading the existing workbook)
A similar question is asked, but that is working on another approach. I would like to write a dataframe directly to excel, but in this question, using xlsxwriter, I don't seem to be able to do it, as it accepts str value only.
Upvotes: 1
Views: 2048
Reputation: 1173
The reason you have only one tab is because thru each iteration you create a new file called "Stock-{}.xlsx" with one tab in it. Depending on your code, you will either end up with multiple excel sheets, or one sheet that gets rewritten multiple times.
Edited after our discussion:
df = pd.read_csv("D:/AmiBackupeSignal/AMGN.txt", index_col="Date/Time")
df.index = pd.to_datetime(df.index)
# get list of all unique days/months
days = df.index.day.unique().sort_values()
months = df.index.month.unique().sort_values()
for month in months:
filepath = "Stock-{}.xlsx".format(month)
with pd.ExcelWriter(filepath) as writer:
for day in days:
name = "Stock_" + "-".join([str(df.index.year.unique()[0]), str(month), str(day)])
# get data for that month/day
# this will create 31 tab
temp = df.loc[(df.index.month == month) & (df.index.day == day)]
temp.to_excel(writer, index = True, sheet_name=name)
writer.save()
Upvotes: 1
Reputation: 352
What you can do is create blank excel file before for loop. Once that is excel book is there you can load it using
from openpyxl import load_workbook
book = load_workbook('CHANGE_THE_NAME.xlsx')
Then for each writer object in loop attach this book to it like this
writer = pandas.ExcelWriter('CHANGE_THE_NAME.xlsx', engine='openpyxl')
writer.book = book
After that, you can use to_excel
to write new sheets in same file. I hope this answers your question.
Upvotes: 0