Reputation: 523
I am trying to figure out the difference between SimpleConnectionPool and ThreadedConnectionPool in the psycopg2 connection pool.
The doc says:
SimpleConnectionPool: connections can only be used inside a single threaded application/script.
ThreadedConnectionPool: connections can be safely used inside multi-threaded app/script.
What does "safely" mean here?
My understanding/confusion:
"""
eg1: Simple Connection Pooling example
"""
import concurrent.futures
import psycopg2.pool

def someTask(id):
    # CRUD queries to Postgres, that I will be multithreading
    print(f"Thread: {id}")
    conn = simple_pool.getconn()
    # do DB operation

simple_pool = psycopg2.pool.SimpleConnectionPool(
    10, 15,
    # DB Info
)
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    executor.map(someTask, range(1, 10))
"""
eg2: Threaded Connection Pooling example
"""
import concurrent.futures
import psycopg2.pool

def someTask(id):
    # CRUD queries to Postgres, that I will be multithreading
    print(f"Thread: {id}")
    conn = threaded_pool.getconn()
    # do DB operation

threaded_pool = psycopg2.pool.ThreadedConnectionPool(
    10, 15,
    # DB Info
)
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    executor.map(someTask, range(1, 10))
Q1: I may be understanding this incorrectly, but in eg1 the someTask() function will be called per thread, so if it's a simple connection pool, will this error out, or will it be UNSAFE (and what does that mean)?
Q2: And in eg2, if the example is fine, what does THREAD SAFE mean? Is it that the someTask() function will be allowed to get a connection out of the pool, while in eg1 it won't?
Q3: Is there any performance difference between the two?
Any additional resources/articles/texts I can read to understand this better are much appreciated. Thank you.
Upvotes: 9
Views: 6374
Reputation: 1631
Q1: I may be understanding this incorrectly, but in eg1 the someTask() function will be called per thread, so if it's a simple connection pool, will this error out, or will it be UNSAFE (and what does that mean)?
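Using SimpleConnectionPool with ThreadPoolExecutor is not a good idea: ThreadPoolExecutor is a multi-threaded executor, so your someTask will be called from several threads at once, and SimpleConnectionPool does nothing to coordinate their access to the pool.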
Q2: And in eg2, if the example is fine, what does THREAD SAFE mean? Is it that the someTask() function will be allowed to get a connection out of the pool, while in eg1 it won't?
Essentially it means that each of your workers will know how to play nice with the others with regard to the connection pool.
Worker A won't start using a connection in the pool only to be interrupted by Worker B.
Workers A, B, C... will interface with the connection pool in an orderly and "friendly" manner, waiting their turn if they need to.
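A minimal sketch of that orderly borrow-and-return pattern, seen from the worker's side. To keep it runnable without a database, StubPool is a hypothetical stand-in that only mimics the getconn()/putconn() interface of a psycopg2 pool (it hands out plain strings, guarded by a lock); some_task and the connection names are likewise made up for illustration. With a real ThreadedConnectionPool the body of some_task should look much the same.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class StubPool:
    """Hypothetical stand-in for a thread-safe pool (not psycopg2 code)."""

    def __init__(self, size):
        self._lock = threading.Lock()
        self._free = [f"conn-{i}" for i in range(size)]
        self.handed_out = set()   # connections currently borrowed
        self.double_use = False   # set if a connection is ever handed out twice

    def getconn(self):
        with self._lock:          # one worker at a time touches the pool
            conn = self._free.pop()
            if conn in self.handed_out:
                self.double_use = True
            self.handed_out.add(conn)
            return conn

    def putconn(self, conn):
        with self._lock:
            self.handed_out.discard(conn)
            self._free.append(conn)


def some_task(pool, task_id):
    conn = pool.getconn()         # borrow a connection
    try:
        # The actual DB work would go here.
        return f"task {task_id} used {conn}"
    finally:
        pool.putconn(conn)        # always hand it back, even on error


pool = StubPool(size=10)
with ThreadPoolExecutor(max_workers=10) as executor:
    results = list(executor.map(lambda i: some_task(pool, i), range(1, 10)))
```

Returning the connection in a finally block matters with the real pools too: as far as I know, a psycopg2 pool raises PoolError when all of its connections are checked out rather than blocking until one is free, so "waiting their turn" applies to access to the pool itself, not to waiting for a free connection.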
Q3: Is there any performance difference between the two?
Both solutions, if implemented correctly, would be performant. The only difference between a single-threaded and a multi-threaded approach is the total time taken and the resources consumed at any given moment.
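To get a rough feel for the overhead side of this, here is a small timing sketch. It rests on an assumption: that the main extra work a thread-safe pool does per getconn()/putconn() call is one uncontended lock acquire and release. It measures only that, not psycopg2 itself, and the numbers will vary by machine.

```python
import threading
import timeit

lock = threading.Lock()


def locked_noop():
    # Stand-in for a pool call that takes a lock around its bookkeeping.
    with lock:
        pass


def plain_noop():
    # Stand-in for the same call without any locking.
    pass


n = 100_000
locked = timeit.timeit(locked_noop, number=n)
plain = timeit.timeit(plain_noop, number=n)
per_call_us = (locked - plain) / n * 1e6
print(f"approx. extra cost per locked call: {per_call_us:.3f} microseconds")
```

Whatever the exact figure on your machine, it is likely to be tiny next to a network round trip to Postgres, so the choice between the two pools is better made on correctness than on speed.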
I hope that helps clear things up. Also check out these posts on the subject, here and here.
Upvotes: 1
Reputation: 146
According to the documentation of SimpleConnectionPool, it is defined as:
A connection pool that can’t be shared across different threads
Which confirms what you said in your first question. Even if it runs without errors, using a SimpleConnectionPool concurrently in multiple threads might result in undefined behaviour/wrong results due to race conditions between threads.
As for your second question, thread safety means that an object can be used concurrently by multiple threads without any need to handle race conditions yourself. You can see that's the case if you follow the implementation of ThreadedConnectionPool. It uses a lock around the pool operations to ensure no connection is shared by two threads at the same time.
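To illustrate the idea, here is a simplified toy model. It is not psycopg2's actual source, and ToyPool, ToySimplePool and ToyThreadedPool are made-up names. The base class does its bookkeeping with no synchronisation; the "simple" variant exposes those methods directly, while the "threaded" variant wraps the same methods in a threading.Lock.

```python
import threading


class ToyPool:
    """Unsynchronised bookkeeping: a free list and a used set."""

    def __init__(self, maxconn):
        self._free = [object() for _ in range(maxconn)]
        self._used = set()

    def _getconn(self):
        if not self._free:
            raise RuntimeError("pool exhausted")
        conn = self._free.pop()   # check-then-pop is not one atomic step
        self._used.add(conn)
        return conn

    def _putconn(self, conn):
        self._used.remove(conn)
        self._free.append(conn)


class ToySimplePool(ToyPool):
    # No locking: fine from a single thread, racy from several.
    getconn = ToyPool._getconn
    putconn = ToyPool._putconn


class ToyThreadedPool(ToyPool):
    def __init__(self, maxconn):
        super().__init__(maxconn)
        self._lock = threading.Lock()

    def getconn(self):
        with self._lock:          # only one thread updates the pool at a time
            return self._getconn()

    def putconn(self, conn):
        with self._lock:
            self._putconn(conn)


pool = ToyThreadedPool(maxconn=8)
taken = []


def worker():
    taken.append(pool.getconn())


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the locked variant, each of the eight threads ends up with a different connection object. With the unlocked variant the same run may well succeed too, which is what makes such races hard to spot: the failure only shows up when two threads happen to interleave inside the bookkeeping.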
I cannot comment on difference in performance between the two as they have different use cases.
Upvotes: 6