jgmh
jgmh

Reputation: 689

Threading inside a cursor in psycopg2

I wrote a script in python that connects to a database using psycopg. I'm working with two tables in the same database. First I run a select * from table_1 and after that I check if each row (fetchone) meets some requirements. If they do, a new row in table_2 is created with an INSERT. The code looks similar to this:

cursor_1 = conn.cursor()
cursor_2 = conn.cursor()

cursor_1.execute("SELECT * FROM table_1")
for i in range(1,n):
    a = cursor_1.fetchone()
    if (condition_in_a):
        cursor_2.execute("INSERT into table_2 (f1,f2) values (v1,v2)")

The script runs in a sequential for loop and I wanted to know if there's a better way to accomplish this task through psycopg2.pool in terms of performance. So my question is: Is it possible to multithread a psycopg cursor? If not, what would be the best approach to achieve a better performance?

Thanks in advance.

Upvotes: 2

Views: 703

Answers (1)

Oleh Rybalchenko
Oleh Rybalchenko

Reputation: 8039

In a single-thread approach, you are trying to keep the balance between memory usage and rows processing speed. fetchone decrease memory usage but require more work to fetch the data. fetchall for huge datasets use large memory amounts and data loading may take some time before processing started.

So, at least, a better way is to fetch data by smaller portions with fetchmany. The following generator may simplify a process:

def ResultIter(cursor, limit=100):
    while True:
        results = cursor.fetchmany(limit)
        if not results:
            break
        for result in results:
            yield result

Based on this approach you can implement multithreading processing as well, but I'm really not sure about performance.

Upvotes: 1

Related Questions