Reputation: 163
I have the following code snippet that writes my output to an Excel file. It works fine if the dataset is not that big (for example, up to 100K records).
But it does not work well when the dataset is large, say 10 million records.
In such a case, how can I split the dataset into several smaller chunks while still writing them all to one file? The goal is to end up with all the data in a single file.
import time

class WriteToExcel(object):
    def generate_report(self, output_data):
        timestr = time.strftime("%Y%m%d-%H%M%S")
        file_path = '/some/path/'
        file_name = 'file' + '_' + timestr + '.xlsx'
        try:
            output_data.to_excel(file_path + file_name, index=False)
            print("File generated named " + file_name + " at the location " + file_path)
        except IOError:
            print("Oops, I think the file path is incorrect.")

## how to invoke the method
nis = WriteToExcel()
nis.generate_report(some_data)
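For reference, note that a single .xlsx sheet holds at most 1,048,576 rows, so 10 million records cannot fit in one sheet regardless of how the write is done. For datasets under that limit, one way to write chunks into a single Excel file is to reuse one `pandas.ExcelWriter` and advance `startrow` per chunk; the frame, file name, and chunk size below are made up for illustration:

```python
import pandas as pd

# Hypothetical frame standing in for the real output_data.
big = pd.DataFrame({'a': range(10), 'b': range(10)})

chunk_size = 4       # illustrative; pick something memory-friendly
next_row = 0
with pd.ExcelWriter('out.xlsx') as writer:
    for start in range(0, len(big), chunk_size):
        chunk = big.iloc[start:start + chunk_size]
        # Write the header only for the first chunk, then keep appending
        # below the rows already written.
        chunk.to_excel(writer, sheet_name='Sheet1', index=False,
                       header=(next_row == 0), startrow=next_row)
        # First chunk also consumed one extra row for the header.
        next_row += len(chunk) + (1 if next_row == 0 else 0)
```

All three chunks land in the same sheet, stacked under a single header row.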
Upvotes: 0
Views: 1026
Reputation: 2152
Just recording here, as the answer, the solution I suggested in the comments: write to CSV instead of Excel.
output_data.to_csv(file_path+file_name, index=False)
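If the frame is too large to write in one go, the CSV approach extends naturally to chunked appends: write the first chunk with a header, then append the remaining chunks without one. A minimal sketch, where the frame, path, and chunk size are made up for illustration:

```python
import pandas as pd

# Hypothetical frame standing in for the real output_data.
big = pd.DataFrame({'a': range(10), 'b': range(10)})

chunk_size = 4          # illustrative chunk size
path = 'output.csv'     # assumed destination path
for i, start in enumerate(range(0, len(big), chunk_size)):
    chunk = big.iloc[start:start + chunk_size]
    # First chunk truncates the file and writes the header;
    # later chunks append without repeating the header.
    chunk.to_csv(path, mode='w' if i == 0 else 'a',
                 header=(i == 0), index=False)
```

The result is one CSV containing all rows under a single header line.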
As a side note, if you want to control the dtype of each column when you re-read the CSV from where you saved it, set the dtype argument manually when you call pd.read_csv. Below is a short example:
import pandas as pd
df = pd.DataFrame().assign(a=range(3), b=list('abc'))
df.a = df.a.astype(str)
filepath = 'example.csv'  # any writable path
df.to_csv(filepath, index=False)
# str --> int conversion happens here, so manually prevent it if you want
df2 = pd.read_csv(filepath, dtype={'a': str})
>>> df2.dtypes
a object
b object
dtype: object
pandas infers the dtype of each column when it is not specified, so if you need to keep str format for values that look like int to pandas, you have to set the dtype manually when the file is read back.
Hope this helps.
Upvotes: 2