Santosh Pillai

Reputation: 1391

Memory issues Converting large CSV file to excel using Python

I am trying to convert a large (~100 MB) CSV file to XLSX using Python. I tried the solution below, but ran into memory issues:

https://stackoverflow.com/a/62750402/1266723

The solution below solves the memory issue by writing in chunks, but it overwrites the file on every iteration. Can someone please recommend how to append to the same file instead of creating new files or overwriting the same one?

https://stackoverflow.com/a/71237366/1266723

import pandas as pd

n = 1000  # number of rows per chunk
df = pd.read_csv("myFile.csv")  # still loads the whole CSV into memory at once

for i in range(0, df.shape[0], n):
    # Each iteration re-opens new_file.xlsx and overwrites it,
    # so only the last chunk survives.
    df[i:i+n].to_excel("new_file.xlsx", index=False, header=False)

Upvotes: 0

Views: 807

Answers (2)

Serge Ballesta

Reputation: 149095

You can use an ExcelWriter to do multiple writes to an xlsx file.

You could even read the csv file in chunks:

n = 1000  # chunksize
with pd.read_csv("myFile.csv", chunksize=n) as reader, \
     pd.ExcelWriter('new_file.xlsx') as writer:
    for i, df in enumerate(reader):
        df.to_excel(writer, index=False, header=False, startrow=i * n)

The above works in recent pandas versions, but read_csv only returns a context manager since version 1.2, so if you are using an older version you should use a try: ... finally: ... block instead:

n = 1000  # chunksize
reader = pd.read_csv("myFile.csv", chunksize=n)
try:
    with pd.ExcelWriter('new_file.xlsx') as writer:
        for i, df in enumerate(reader):
            df.to_excel(writer, index=False, header=False, startrow=i * n)
finally:
    reader.close()
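If the writer side also runs short of memory, XlsxWriter's constant_memory streaming mode can be combined with the same chunked loop. This is a sketch, not part of the original answer: it assumes pandas ≥ 1.3 (for the engine_kwargs parameter) and the xlsxwriter package, and it generates a small sample CSV inline so it is self-contained. In constant_memory mode finished rows are flushed to disk rather than held in RAM, which requires rows to be written in ascending order, exactly what the chunked loop guarantees.

```python
import pandas as pd

# A small sample CSV stands in for the real ~100 MB file.
pd.DataFrame({"a": range(2500), "b": range(2500)}).to_csv("myFile.csv", index=False)

n = 1000  # chunksize
# engine_kwargs is forwarded to xlsxwriter.Workbook; constant_memory
# streams finished rows to disk instead of buffering the whole sheet.
with pd.read_csv("myFile.csv", chunksize=n) as reader, \
     pd.ExcelWriter("new_file.xlsx", engine="xlsxwriter",
                    engine_kwargs={"options": {"constant_memory": True}}) as writer:
    for i, df in enumerate(reader):
        df.to_excel(writer, index=False, header=False, startrow=i * n)
```

The trade-off is that constant_memory forbids going back to earlier rows, so it only works when, as here, chunks are written strictly top to bottom.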

Upvotes: 1

tturbo

Reputation: 1166

Have you tried this (see the second-to-last example at the bottom of the page)?

# ExcelWriter can also be used to append to an existing Excel file:

with pd.ExcelWriter('output.xlsx', mode='a') as writer:
    df.to_excel(writer, sheet_name='Sheet_name_3')
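Note that mode='a' appends sheets, not rows: each to_excel call adds a new worksheet to a workbook that must already exist, using the openpyxl engine. A minimal self-contained sketch (the file and sheet names here are illustrative, not from the question):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})

# mode='a' requires an existing workbook, so create one first.
df.to_excel("output.xlsx", sheet_name="Sheet_name_1", index=False)

# Appending re-opens the existing file and adds a second sheet.
with pd.ExcelWriter("output.xlsx", mode="a", engine="openpyxl") as writer:
    df.to_excel(writer, sheet_name="Sheet_name_3", index=False)
```

So this helps when each chunk may live on its own sheet, but it does not append rows to an existing sheet the way the chunked startrow approach does.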

Upvotes: 0
