haojie
haojie

Reputation: 753

How to add Pandas dataframe to an existing xlsx file using to_excel

I have write some content to a xlsx file by using xlsxwriter

workbook = xlsxwriter.Workbook(file_name)
worksheet = workbook.add_worksheet()
worksheet.write(row, col, value)
worksheet.close()

I'd like to add a dataframe after the existing rows to this file by to_excel

 df.to_excel(file_name,
             startrow=len(existing_content),
             engine='xlsxwriter')

However, this seems not work.The dataframe not inserted to the file. Anyone knows why?

Upvotes: 1

Views: 886

Answers (2)

patrickjlong1
patrickjlong1

Reputation: 3823

The OP is using xlsxwriter in the engine parameter. Per XlsxWriter documentation "XlsxWriter is designed only as a file writer. It cannot read or modify an existing Excel file." (link to XlsxWriter Docs). Below I've provided a fully reproducible example of how you can go about modifying an existing .xlsx workbook using the openpyxl module (link to Openpyxl Docs).

For demonstration purposes, I'll first create create a workbook called test.xlsx using pandas:

import pandas as pd

df = pd.DataFrame({'Col_A': [1,2,3,4],
                   'Col_B': [5,6,7,8],
                   'Col_C': [0,0,0,0],
                   'Col_D': [13,14,15,16]})

df.to_excel('test.xlsx', index=False)

This is the Expected output at this point:

enter image description here

Using openpyxl you can use another dataset to load the existing workbook ('test.xlsx') and modify the third column with different data from the new dataframe while preserving the other existing data. In this example, for simplicity, I update it with a one column dataframe but you could extend it to update or add more data.

from openpyxl import load_workbook
import pandas as pd
df_new = pd.DataFrame({'Col_C': [9, 10, 11, 12]})
wb = load_workbook('test.xlsx')

ws = wb['Sheet1']

for index, row in df_new.iterrows():
    cell = 'C%d'  % (index + 2)
ws[cell] = row[0]

wb.save('test.xlsx')

With the Expected output at the end:

enter image description here

Upvotes: 0

yunwoong
yunwoong

Reputation: 35

Unfortunately, as the content above is not specifically written, let's take a look at to_excel and XlsxWriter as examples.

using xlsxwriter

import xlsxwriter

# Create a new Excel file and add a worksheet
workbook = xlsxwriter.Workbook('example.xlsx')
worksheet = workbook.add_worksheet()

# Add some data to the worksheet
worksheet.write('A1', 'Language')
worksheet.write('B1', 'Score')
worksheet.write('A2', 'Python')
worksheet.write('B2', 100)
worksheet.write('A3', 'Java')
worksheet.write('B3', 98)
worksheet.write('A4', 'Ruby')
worksheet.write('B4', 88)

# Save the file
workbook.close()

Using the above code, we have saved the table similar to the one below to an Excel file.

Language Score
Python 100
Java 98
Ruby 88

Next, if we want to add rows using a dataframe.to_excel :

using to_excel

import pandas as pd

# Load an existing Excel file
existing_file = pd.read_excel('example.xlsx')

# Create a new DataFrame to append
df = pd.DataFrame({
    'Language': ['C++', 'Javascript', 'C#'],
    'Score': [78, 97, 67]
})

# Append the new DataFrame to the existing file
result = pd.concat([existing_file, df])

# Write the combined DataFrame to the existing file
result.to_excel('example.xlsx', index=False)

The reason for using pandas concat:

  • To append, it is necessary to use pandas.DataFrame.ExcelWriter(), but XlsxWriter does not support append mode in ExcelWriter
  • Although the task can be accomplished using pandas.DataFrame.append(), the append method is slated to be deleted in the future, so we use concat instead.

Upvotes: 1

Related Questions