Reputation: 1311
So I have a table with 146 columns and approximately 8 million rows of sparse data stored locally in a PostgreSQL database.
My goal is to select the whole dataset at once, store it into a pandas dataframe and perform some calculations.
So far I have read about server-side cursors in many threads, but I guess I'm doing something wrong as I don't see any improvement in memory usage. The documentation is also quite limited.
My code so far is the following:
import pandas as pd

# conn is an existing psycopg2 connection
cur = conn.cursor('testCursor')  # named cursor, i.e. a server-side cursor
cur.itersize = 100000
cur.execute("select * from events")
df = cur.fetchall()              # fetches every row at once
df = pd.DataFrame(df)
conn.commit()
conn.close()
I also tried using fetchmany() or fetchone() instead of fetchall(), but I don't know how to scroll through the results. I guess I could use something like this for fetchone(), but I don't know how to handle fetchmany():
row = cur.fetchone()
while row:
    # do something with row here
    row = cur.fetchone()
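For fetchmany() I guess it would be something along these lines, although I'm not sure this is the right approach (the batch size of 100000 is just an arbitrary value I picked):

while True:
    rows = cur.fetchmany(100000)   # next batch of rows
    if not rows:
        break
    chunk = pd.DataFrame(rows)     # one dataframe per batch
    # do something with chunk here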
Lastly, in the case of fetchone() and fetchmany(), how can I concatenate the results into a single dataframe without consuming all of my memory? Just to note that I have 16 GB of RAM available.
Upvotes: 0
Views: 1801
Reputation: 12563
8 million rows x 146 columns (assuming each column stores at least one byte) already gives you at least 1 GB. Since your columns most likely store more than one byte each, even if the first step of what you are trying to do succeeded, you would hit RAM constraints (i.e. the end result won't fit in RAM).
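To put a rough number on it: if pandas ends up holding most of those 146 columns as 8-byte float64 values (an assumption, since the schema isn't posted), that is 8,000,000 rows x 146 columns x 8 bytes ≈ 9.3 GB for the dataframe alone, before any overhead for string/object columns. A single in-memory dataframe is therefore already close to your 16 GB limit.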
The usual strategy for processing large datasets is to process them in small batches and then, if needed, combine the results. Have a look at PySpark, for example.
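Before reaching for Spark, the same idea works with the tools you are already using. Here is a minimal sketch, assuming a psycopg2 server-side (named) cursor, a placeholder connection string, and that your calculations can be expressed as per-batch partial results (sums in this example) that are combined at the end:

import pandas as pd
import psycopg2

conn = psycopg2.connect("dbname=yourdb")  # placeholder, use your own connection details
cur = conn.cursor('batchCursor')          # named cursor = server-side cursor
cur.execute("select * from events")

partial = None
while True:
    rows = cur.fetchmany(100000)          # only this batch is held in client memory
    if not rows:
        break
    chunk = pd.DataFrame(rows)
    batch_result = chunk.sum(numeric_only=True)   # example per-batch calculation
    partial = batch_result if partial is None else partial.add(batch_result, fill_value=0)

cur.close()
conn.close()

The key point is that what you keep around (partial here) stays small, while each batch dataframe is discarded after it has been processed.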
Upvotes: 1