Reputation: 6443
I have a csv file with millions of rows. I used to create a dictionary out of the csv file like this:
import csv

with open('us_db.csv', newline='') as f:
    data = csv.reader(f)
    for row in data:
        # Create Dictionary based on a column
        ...
Now, to filter the rows based on some conditions, I use a pandas DataFrame, as it is very fast for these operations. I load the csv into a DataFrame and do some filtering, and then I want to continue with the loop above. I thought of using pandas df.iterrows() or df.itertuples(), but they are really slow.
Is there a way to feed the pandas DataFrame to csv.reader() directly so that I can keep using the code above? If I use csv_rows = df.to_csv(), it gives one long string. Of course, I could write out a csv file and then read it back in, but I want to know if there is a way to skip the extra write and read.
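For reference, the load-and-filter step looks roughly like this (the column name and condition are made up for illustration):

import pandas as pd

df = pd.read_csv('us_db.csv')
filtered = df[df['state'] == 'CA']  # hypothetical filter condition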
Upvotes: 7
Views: 9683
Reputation: 34
Why don't you apply your dictionary-building function to the target column? Something like:
df['column_name'] = df['column_name'].apply(create_dictionary)
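A runnable sketch of that idea, where create_dictionary is a hypothetical stand-in for whatever per-value logic you have:

import pandas as pd

def create_dictionary(value):
    # hypothetical placeholder for your per-value logic
    return {'value': value}

df = pd.DataFrame({'column_name': ['a', 'b']})  # toy data for illustration
df['column_name'] = df['column_name'].apply(create_dictionary)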
Upvotes: 0
Reputation: 8483
You could do something like this:
import numpy as np
import pandas as pd
from io import StringIO
import csv

# random dataframe
df = pd.DataFrame(np.random.randn(3, 4))

buffer = StringIO()  # create an empty in-memory text buffer
df.to_csv(buffer)    # write the dataframe into the buffer as csv
buffer.seek(0)       # rewind to the start of the stream

for row in csv.reader(buffer):
    # do stuff
    ...
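Note that to_csv also writes the index and the header row by default, so the rows your loop sees won't match the original file exactly. If the downstream loop expects rows shaped like the original csv, you can suppress them:

df.to_csv(buffer, index=False)  # omit the index column (add header=False to drop the header row too)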
Upvotes: 14