Convert dataframe into pandas.io.parsers.TextFileReader object (chunks)

Question

I've seen quite a few questions on how to split a dataframe into chunks. What I want to know is how to convert a dataframe into exactly the same object that you get when loading a CSV file with the chunksize parameter, i.e.:

df = pd.read_csv(file_path, chunksize=1e5)
type(df)
>> pandas.io.parsers.TextFileReader

I want to recreate an identical TextFileReader object from a dataframe containing the dataframe data in various chunks. Any ideas on how to do this?

RomanPerekhrest · Accepted Answer

Write the dataframe to an in-memory text stream (StringIO) and read it back with pd.read_csv:

(df below contains a sample dataframe)

In [216]: df
Out[216]: 
     Date  Name  Wage
0  5/1/19   Joe  $100
1  5/1/19   Sam  $120
2  5/1/19  Kate   $30
3  5/2/19   Joe  $120
4  5/2/19   Sam  $134
5  5/2/19  Kate   $56
6  5/3/19   Joe   $89
7  5/3/19   Sam   $90
8  5/3/19  Kate  $231

In [217]: from io import StringIO  # pandas.compat.StringIO was removed in pandas 0.25

In [218]: reader = pd.read_csv(StringIO(df.to_csv()), iterator=True)

In [219]: type(reader)
Out[219]: pandas.io.parsers.TextFileReader

In [220]: reader.get_chunk(3)
Out[220]: 
   Unnamed: 0    Date  Name  Wage
0           0  5/1/19   Joe  $100
1           1  5/1/19   Sam  $120
2           2  5/1/19  Kate   $30
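The Unnamed: 0 column in the output above is the original index, which df.to_csv() writes as the first CSV column by default. A minimal sketch of the same round trip with index=False, so the chunks carry only the original columns. The small frame below is just a stand-in for the sample data, and the variable names are illustrative:

```python
from io import StringIO

import pandas as pd

# Small stand-in for the sample dataframe shown above.
df = pd.DataFrame({
    "Date": ["5/1/19", "5/1/19", "5/1/19", "5/2/19"],
    "Name": ["Joe", "Sam", "Kate", "Joe"],
    "Wage": ["$100", "$120", "$30", "$120"],
})

# index=False keeps the index out of the CSV text,
# so read_csv does not produce an "Unnamed: 0" column.
reader = pd.read_csv(StringIO(df.to_csv(index=False)), iterator=True)

first = reader.get_chunk(3)  # the first three rows
rest = reader.get_chunk(3)   # only one row is left, so this chunk is shorter
```

If the index itself matters, an alternative is to keep the default to_csv() call and pass index_col=0 to pd.read_csv instead.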

You can also set a fixed chunk size with the chunksize option of pd.read_csv:

iterator : boolean, default False
Return TextFileReader object for iteration or getting chunks with get_chunk().
chunksize : int, default None
Return TextFileReader object for iteration.

http://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#io-chunking
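As a rough sketch of the chunksize variant, the round trip can be wrapped in a small function and the resulting reader iterated like one returned for a file on disk. The helper name to_chunk_reader and the sample columns are hypothetical, not part of pandas or of the original answer:

```python
from io import StringIO

import pandas as pd


def to_chunk_reader(frame, chunksize):
    """Hypothetical helper: dump `frame` to CSV text in memory and
    re-read it lazily in chunks of `chunksize` rows."""
    buffer = StringIO(frame.to_csv(index=False))
    return pd.read_csv(buffer, chunksize=chunksize)


# Illustrative data only.
df = pd.DataFrame({
    "Name": ["Joe", "Sam", "Kate", "Joe", "Sam"],
    "Hours": [8, 7, 5, 6, 9],
})

reader = to_chunk_reader(df, chunksize=2)
chunks = list(reader)                # each item is a DataFrame
sizes = [len(c) for c in chunks]     # 2, 2 and 1 rows
rebuilt = pd.concat(chunks, ignore_index=True)
```

Keep in mind that going through CSV text re-infers dtypes on the way back in, so columns such as datetimes come back as plain strings unless you pass options like parse_dates or dtype to pd.read_csv. The whole frame is also serialised into memory once, so this is mainly useful for testing code that expects a TextFileReader.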
