swiss_knight

Reputation: 7881

Executing more than one SQL query using psycopg2 in a double with statement

Is it possible to pass more than one query inside a single double with statement with psycopg2 (the first with opening the connection, the second the cursor)?

E.g. to replace:

import psycopg2

def connector():
    return psycopg2.connect(**DB_DICT_PARAMS)

########

sql_update1 = ("UPDATE table SET array = %s::varchar[], "
               "array_created = true, timestamp = now() AT TIME ZONE 'UTC' "
               "WHERE id = %s")

sql_update2 = ("UPDATE table SET json_field = %s "
               "WHERE id = %s")

with connector() as conn:
    with conn.cursor() as curs:
        curs.execute(sql_update1, [stringArray, ID])

with connector() as conn:
    with conn.cursor() as curs:
        curs.execute(sql_update2, [jsonString, ID])

by:

#(...)

sql_update1 = ("UPDATE table SET array = %s::varchar[], "
               "array_created = true, timestamp = now() AT TIME ZONE 'UTC' "
               "WHERE id = %s")

sql_update2 = ("UPDATE table SET json_field = %s "
               "WHERE id = %s")

with connector() as conn:
    with conn.cursor() as curs:
        curs.execute(sql_update1, [stringArray, ID])
        curs.execute(sql_update2, [jsonString, ID])

What if the second query needs the first one to be completed first, and what if it doesn't?
In the shown case, they will definitely update the same record (i.e. row) in the database but not the same fields (i.e. attributes or columns).

Is this allowed precisely because the two SQL statements are committed sequentially, i.e. the first finishes first, and then, and only then, the second is executed?

Or is it actually forbidden because they could be executed in parallel, each query unaware of the state of the other at any instant t?

There are no fancy triggers or procedures in the DB. Let's keep it simple for now.
(Please note that I have purposefully written two queries here where a single one would have fit perfectly, but that is not always the case, as some computations sometimes have to happen before other results can be saved to the same record in the DB.)

Upvotes: 5

Views: 15476

Answers (2)

bobflux

Reputation: 11581

Better avoid this:

with connector() as conn:
    with conn.cursor() as curs:
        curs.execute(sql_update1, [stringArray, ID])

with connector() as conn:
    with conn.cursor() as curs:
        curs.execute(sql_update2, [jsonString, ID])

Opening a database connection is pretty slow compared to doing a query, so it is much better to reuse it rather than opening a new one for each query. If your program is a script, typically you'd just open the connection at startup and close it at exit.
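For example, a minimal sketch of the "open once at startup, reuse everywhere" pattern, assuming the DB_DICT_PARAMS dict and the sql_update strings from the question:

import atexit
import psycopg2

# One long-lived connection for the whole script
# (DB_DICT_PARAMS is the same dict of connection parameters as in the question).
conn = psycopg2.connect(**DB_DICT_PARAMS)
atexit.register(conn.close)  # close it when the script exits

def update_record(record_id, string_array, json_string):
    # "with conn" commits on success and rolls back on error,
    # but it does NOT close the connection, so it can be reused.
    with conn:
        with conn.cursor() as curs:
            curs.execute(sql_update1, [string_array, record_id])
            curs.execute(sql_update2, [json_string, record_id])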

However, if your program spends a long time waiting between queries, and there will be many instances running, then it would be better to close the connection to not consume valuable RAM on the postgres server for doing nothing. This is common in client/server applications where the client mostly waits for user input. If there are many clients you can also use connection pooling, which offers the best of both worlds at the cost of a bit extra complexity. But if it's just a script, no need to bother with that.
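If you do go the pooling route, a rough sketch with psycopg2's built-in pool could look like this (the pool sizes here are arbitrary):

from psycopg2 import pool

# A small pool shared by the whole process; min/max sizes are arbitrary here.
db_pool = pool.SimpleConnectionPool(1, 5, **DB_DICT_PARAMS)

def run_update(query, params):
    conn = db_pool.getconn()         # borrow a connection from the pool
    try:
        with conn:                   # commit on success, roll back on error
            with conn.cursor() as curs:
                curs.execute(query, params)
    finally:
        db_pool.putconn(conn)        # always give it back to the pool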

with connector() as conn:
    with conn.cursor() as curs:
        curs.execute(sql_update1, [stringArray, ID])
        curs.execute(sql_update2, [jsonString, ID])

This would be faster. You don't need to build a new cursor; you can reuse the same one. Note that if you don't fetch the results of the first query before reusing the cursor, you won't be able to do so after executing the second query, because a cursor only stores the results of the last query. Since these are UPDATEs, there are no result rows anyway, unless you want to check the rowcount to see whether a row was actually updated.
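For example, a small sketch of checking rowcount after each UPDATE while reusing the same cursor (variable names as in the question):

with connector() as conn:
    with conn.cursor() as curs:
        curs.execute(sql_update1, [stringArray, ID])
        rows_first = curs.rowcount   # rows touched by the first UPDATE
        curs.execute(sql_update2, [jsonString, ID])
        rows_second = curs.rowcount  # rowcount now refers to the second UPDATE only
        if rows_first == 0 or rows_second == 0:
            print("warning: no row updated for id", ID)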

What if the second query needs the first one to be completed before, and what if not?

Don't care. execute() processes the whole query before returning, so by the time python gets to the next bit of code, the query is done.
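As a small illustration (some_table and its columns are just placeholder names), the second statement can already rely on what the first one changed, even before anything is committed:

with connector() as conn:
    with conn.cursor() as curs:
        curs.execute("UPDATE some_table SET array_created = true WHERE id = %s", [ID])
        # This statement only runs after the UPDATE has finished, and it already
        # sees the new value, even though nothing has been committed yet.
        curs.execute("SELECT array_created FROM some_table WHERE id = %s", [ID])
        print(curs.fetchone())  # -> (True,)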

Is this allowed precisely because the two SQL statements are committed sequentially, i.e. the first finishes first, and then, and only then, the second is executed?

Yes, they are executed in the order you give them, one after the other.

Note that the word "committed" is not ideal here. If you have turned on autocommit, then yes, it will commit after each query, ending the transaction. That is usually not what you want, because you want to stay in control of your transactions: if you're going to insert several rows that depend on each other, it's often better to either insert them all, or not at all. In addition, a commit requires making sure the data is securely written to storage, which is slow. Grouping inserts or updates inside a transaction is much faster.
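A minimal sketch of the difference (autocommit is off by default in psycopg2):

conn = connector()

# Default: autocommit is off, so both updates belong to one transaction
# and become visible to other sessions only when you commit.
with conn.cursor() as curs:
    curs.execute(sql_update1, [stringArray, ID])
    curs.execute(sql_update2, [jsonString, ID])
conn.commit()          # one commit for both statements

# With autocommit on, each execute() is committed on its own.
conn.autocommit = True
with conn.cursor() as curs:
    curs.execute(sql_update1, [stringArray, ID])   # committed immediately
    curs.execute(sql_update2, [jsonString, ID])    # committed immediately

conn.close()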

Regarding error handling, if one of the statements throws an exception (for example a foreign key violation) then the whole transaction is thrown away and postgres will not accept new commands until your code acknowledges it's handling the error by sending a rollback. So you can put all your queries within a try block, and in the except clause issue a rollback among other error handling code. This way either everything will be committed, or it will all be rolled back, and your database stays clean.
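For instance, a rough sketch of that pattern:

conn = connector()
try:
    with conn.cursor() as curs:
        curs.execute(sql_update1, [stringArray, ID])
        curs.execute(sql_update2, [jsonString, ID])
    conn.commit()        # both updates land together
except psycopg2.Error:
    conn.rollback()      # undo everything; the connection is usable again
    raise                # or handle/log the error here instead
finally:
    conn.close()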

Or is it actually forbidden because they could be executed in parallel, each query unaware of the state of the other at any instant t?

If you want to execute several queries in parallel, for example because one query takes a while and you want other queries to keep running in the meantime, then you need several DB connections, and of course one Python thread for each, because execute() is blocking. This is not needed often.
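If you ever do need that, a rough sketch would be one connection per thread (note that two updates of the same row would still wait on each other's row lock, so this pays off mostly for independent queries):

import threading

def run_in_own_connection(query, params):
    # Each thread gets its own connection so the queries can really run in
    # parallel; sharing a single connection would serialize them.
    conn = connector()
    try:
        with conn:                       # commit or roll back this thread's work
            with conn.cursor() as curs:
                curs.execute(query, params)
    finally:
        conn.close()

t1 = threading.Thread(target=run_in_own_connection, args=(sql_update1, [stringArray, ID]))
t2 = threading.Thread(target=run_in_own_connection, args=(sql_update2, [jsonString, ID]))
t1.start(); t2.start()
t1.join(); t2.join()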

Upvotes: 8

ddg

Reputation: 1098

If you want to send them together in a single execute() call, simply put them in the same string separated by semicolons. I'm a little rusty but I think the following should work:

sql_updates = ("UPDATE table SET array = %s::varchar[], "
               "array_created = true, timestamp = now() AT TIME ZONE 'UTC' "
               "WHERE id = %s;"
               "UPDATE table SET json_field = %s "
               "WHERE id = %s;")

with connector() as conn:
    with conn.cursor() as curs:
        curs.execute(sql_updates, [stringArray, ID, jsonString, ID])

Upvotes: 5
