En_g_neer
En_g_neer

Reputation: 107

Adding a pandas.DataFrame to Existing Excel File

I have a web scraper which creates an excel file for this month's scrapes. I want to add today's scrape and every scrape for that month into that file as a new sheet every time it is run. My issue, however, has been that it only overwrites the existing sheet with a new sheet instead of adding it as a separate new sheet. I've tried to do it with xlrd, xlwt, pandas, and openpyxl.

Still brand new to Python so simplicity is appreciated!

Below is just the code dealing with writing the excel file.

# My relevant time variables
ts = time.time()
date_time = datetime.datetime.fromtimestamp(ts).strftime('%y-%m-%d %H_%M_%S')
HourMinuteSecond = datetime.datetime.fromtimestamp(ts).strftime('%H_%M_%S')
month = datetime.datetime.now().strftime('%m-%y')

# Creates a writer for this month and year
writer = pd.ExcelWriter(
    'C:\\Users\\G\\Desktop\\KickstarterLinks(%s).xlsx' % (month), 
    engine='xlsxwriter')

# Creates dataframe from my data, d
df = pd.DataFrame(d)

# Writes to the excel file
df.to_excel(writer, sheet_name='%s' % (HourMinuteSecond))
writer.save()

Upvotes: 3

Views: 5813

Answers (1)

Stephen Rauch
Stephen Rauch

Reputation: 49842

Update:

This functionality has been added to pandas 0.24.0:

ExcelWriter now accepts mode as a keyword argument, enabling append to existing workbooks when using the openpyxl engine (GH3441)

Previous version:

Pandas has an open feature request for this.

In the mean time, here is a function which adds a pandas.DataFrame to an existing workbook:

Code:

def add_frame_to_workbook(filename, tabname, dataframe, timestamp):
    """
    Save a dataframe to a workbook tab with the filename and tabname
    coded to timestamp

    :param filename: filename to create, can use strptime formatting
    :param tabname: tabname to create, can use strptime formatting
    :param dataframe: dataframe to save to workbook
    :param timestamp: timestamp associated with dataframe
    :return: None
    """
    filename = timestamp.strftime(filename)
    sheet_name = timestamp.strftime(tabname)

    # create a writer for this month and year
    writer = pd.ExcelWriter(filename, engine='openpyxl')
    
    try:
        # try to open an existing workbook
        writer.book = load_workbook(filename)
        
        # copy existing sheets
        writer.sheets = dict(
            (ws.title, ws) for ws in writer.book.worksheets)
    except IOError:
        # file does not exist yet, we will create it
        pass

    # write out the new sheet
    dataframe.to_excel(writer, sheet_name=sheet_name)
    
    # save the workbook
    writer.save()

Test Code:

import datetime as dt
import pandas as pd
from openpyxl import load_workbook

data = [x.strip().split() for x in """
                   Date  Close
    2016-10-18T13:44:59  2128.00
    2016-10-18T13:59:59  2128.75
""".split('\n')[1:-1]]
df = pd.DataFrame(data=data[1:], columns=data[0])

name_template = './sample-%m-%y.xlsx'
tab_template = '%d_%H_%M'
now = dt.datetime.now()
in_an_hour = now + dt.timedelta(hours=1)
add_frame_to_workbook(name_template, tab_template, df, now)
add_frame_to_workbook(name_template, tab_template, df, in_an_hour)

(Source)

Upvotes: 4

Related Questions