Chris
Chris

Reputation: 3269

Streaming data from Postgres into Python

I'm looking for advice on efficient ways to stream data incrementally from a Postgres table into Python. I'm in the process of implementing an online learning algorithm and I want to read batches of training examples from the database table into memory to be processed. Any thoughts on good ways to maximize throughput? Thanks for your suggestions.

Upvotes: 14

Views: 15805

Answers (2)

Harvey
Harvey

Reputation: 637

You may want to look into the Postgres LISTEN/NOTIFY functionality https://www.postgresql.org/docs/9.1/static/sql-notify.html

Upvotes: 0

jjanes
jjanes

Reputation: 44202

If you are using psycopg2, then you will want to use a named cursor, otherwise it will try to read the entire query data into memory at once.

cursor = conn.cursor("some_unique_name")
cursor.execute("SELECT aid FROM pgbench_accounts")
for record in cursor:
    something(record)

This will fetch the records from the server in batches of 2000 (default value of itersize) and then parcel them out to the loop one at a time.

Upvotes: 23

Related Questions