juanschwartz

Reputation: 91

psycopg2: process cursor results with multiple threads or processes

I have a function that queries a large table for the purpose of indexing it. It creates a server-side cursor named "all_accounts":

def get_all_accounts(self):
    cursor = self.get_cursor('all_accounts')
    cursor.execute("SELECT * FROM account_summary LIMIT 20000;")
    return cursor

I then process those records 2,000 at a time to insert them into a NoSQL solution:

def index_docs(self, cursor):
    while True:
        # Consume the result set over a series of iterations,
        # fetching 2000 records per iteration.
        records = cursor.fetchmany(size=2000)

        if not records:
            break

        for r in records:
            # do stuff
            pass

I'd like the index_docs function to consume the fetchmany() results with roughly 10 parallel workers, since my bottleneck is not the target system but rather the single-threaded nature of my script. I have done a few async/worker things in the past, but the psycopg2 cursor seemed like it might be an issue. Thoughts?

Upvotes: 0

Views: 1032

Answers (1)

AKX

Reputation: 168913

I think you'll be safe if a single process/thread accesses the cursor and dishes out work to multiple worker processes that push to the other database. (At a quick glance, server-side cursors can't be shared between connections, but I could be wrong there.)

That is, something like this. Generally you'd use imap_unordered to iterate over a collection of single items (with a higher chunksize than the default 1), but I think we can just as well use the batches here; a sketch of that single-item variant follows the code below.

import multiprocessing

import psycopg2


def get_batches(conn):
    # A named cursor is a server-side cursor in psycopg2.
    cursor = conn.cursor('all_accounts')
    cursor.execute("SELECT * FROM account_summary LIMIT 20000;")
    while True:
        records = cursor.fetchmany(size=500)
        if not records:
            break
        yield records


def process_batch(batch):
    # (this function is run in child processes)
    for r in batch:
        pass  # do stuff with r
    return "some arbitrary result"


def main():
    conn = psycopg2.connect(...)  # fill in connection parameters
    with multiprocessing.Pool() as p:
        batch_generator = get_batches(conn)
        for result in p.imap_unordered(process_batch, batch_generator):
            print(result)  # doesn't really matter


if __name__ == '__main__':
    main()
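
For comparison, the single-item variant mentioned above could look roughly like this. It's only a sketch (the row-handling body and the chunksize value are placeholders, and it assumes your rows pickle cleanly for transfer to the workers): the cursor is still consumed by the parent process, and chunksize does the batching for you at the IPC level.

import multiprocessing

import psycopg2


def get_rows(conn):
    # Same server-side cursor, but yield individual rows
    # instead of explicit batches.
    cursor = conn.cursor('all_accounts')
    cursor.execute("SELECT * FROM account_summary LIMIT 20000;")
    while True:
        records = cursor.fetchmany(size=500)
        if not records:
            break
        yield from records


def process_row(row):
    # (runs in a child process)
    return "some arbitrary result"


def main():
    conn = psycopg2.connect(...)  # fill in connection parameters
    with multiprocessing.Pool() as p:
        # chunksize groups rows into chunks for transfer to the
        # workers, much like the explicit batches above.
        for result in p.imap_unordered(process_row, get_rows(conn), chunksize=500):
            print(result)


if __name__ == '__main__':
    main()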

Upvotes: 2
