Raja

Reputation: 6443

How to convert a pandas DataFrame to a csv reader directly in Python?

I have a CSV file with millions of rows. I used to create a dictionary out of the CSV file like this:

import csv

# In Python 3, open the file in text mode with newline='' for csv.reader
with open('us_db.csv', newline='') as f:
    data = csv.reader(f)
    for row in data:
        pass  # create dictionary based on a column

Now, to filter the rows based on some conditions, I use a pandas DataFrame, as it is very fast at these operations. I load the CSV as a DataFrame and do some filtering. Then I want to continue with the loop above. I thought of using df.iterrows() or df.itertuples(), but they are really slow.

Is there a way to convert the DataFrame to a csv.reader() directly, so that I can continue to use the above code? If I use csv_rows = df.to_csv(), it gives one long string. Of course, I can write out a CSV file and then read it back in, but I want to know if there is a way to skip the extra write to and read from a file.

Upvotes: 7

Views: 9683

Answers (2)

Qikai

Reputation: 34

Why don't you apply your dictionary-building function to the target column? Something like:

# create_dictionary is a placeholder for your own function
df['column_name'] = df['column_name'].apply(create_dictionary)
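If the goal is a dictionary keyed on one column, it can also be built from the DataFrame directly, without going through csv at all. This is only a minimal sketch: the column names `state` and `city` and the sample data are assumptions made for illustration, not taken from the question.

```python
import pandas as pd

# Hypothetical data standing in for the filtered DataFrame.
df = pd.DataFrame({
    "state": ["CA", "NY", "CA"],
    "city": ["Los Angeles", "New York", "San Diego"],
})

# Group the rows by the key column and collect the other column into lists,
# then turn the resulting Series into a plain dict.
cities_by_state = df.groupby("state")["city"].apply(list).to_dict()
```

Whether this fits depends on what the dictionary is supposed to hold; for one value per key, something like dict(zip(df["state"], df["city"])) may be enough.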

Upvotes: 0

Bob Haffner

Reputation: 8483

You could do something like this:

import csv
from io import StringIO

import numpy as np
import pandas as pd

# random dataframe
df = pd.DataFrame(np.random.randn(3, 4))

buffer = StringIO()  # create an empty in-memory text buffer
df.to_csv(buffer)    # write the CSV text into the buffer instead of a file
buffer.seek(0)       # rewind to the start of the stream before reading

for row in csv.reader(buffer):
    pass  # do stuff; the first row is the header, the first field the index
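By default, to_csv also writes the index and a header row into the buffer, so the reader sees both. As a rough sketch of how the buffer could feed the dictionary-building loop from the question, assuming hypothetical columns `zip` and `city` and made-up sample rows:

```python
import csv
from io import StringIO

import pandas as pd

# Hypothetical data standing in for the filtered DataFrame.
df = pd.DataFrame({
    "zip": ["10001", "94105"],
    "city": ["New York", "San Francisco"],
})

buffer = StringIO()
# index=False drops the row labels and header=False drops the column names,
# so the reader yields data rows only.
df.to_csv(buffer, index=False, header=False)
buffer.seek(0)

city_by_zip = {}
for row in csv.reader(buffer):
    # Every field comes back as a string, as it would from a file on disk.
    city_by_zip[row[0]] = row[1]
```

This still serializes and re-parses every row in memory, so it avoids the disk round trip but not the cost of the text conversion.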

Upvotes: 14
