pk10

Reputation: 523

SimpleConnectionPool vs ThreadedConnectionPool: what does it mean to be thread safe?

I am trying to figure out the difference between SimpleConnectionPool and ThreadedConnectionPool in psycopg2 connection pool.

The doc says:
SimpleConnectionPool connections can only be used inside a single threaded application/script.
ThreadedConnectionPool connections can be safely used inside multi-threaded app/script.

What does safely mean here?

My understanding/confusion:


"""
eg1: Simple Connection Pooling example
"""

import psycopg2.pool
import concurrent.futures

def someTask(id):
  # CRUD queries to Postgres, that I will be multithreading
  print(f"Thread: {id}")
  conn = simple_pool.getconn()
  # do DB operation


simple_pool = psycopg2.pool.SimpleConnectionPool(
  10, 15,  # minconn, maxconn
  # DB Info
)

with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
  executor.map(someTask, range(1,10))


"""
eg2: Threaded Connection Pooling example
"""

import psycopg2.pool
import concurrent.futures

def someTask(id):
  # CRUD queries to Postgres, that I will be multithreading
  print(f"Thread: {id}")
  conn = threaded_pool.getconn()
  # do DB operation


threaded_pool = psycopg2.pool.ThreadedConnectionPool(
  10, 15,  # minconn, maxconn
  # DB Info
)

with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
  executor.map(someTask, range(1,10))

Q1: I may be understanding this incorrectly, but in eg1 the someTask() function will be called once per thread, so if it's a simple connection pool, will this error out or be UNSAFE (and what does that mean)?

Q2: And in eg2, if the example is fine, what does THREAD SAFE mean? Will the someTask() function be allowed to get a connection out of the pool there, whereas in eg1 it won't?

Q3: Is there any performance difference between the two?

Any additional resources/articles/texts I can read to understand this better are much appreciated. Thank you.

Upvotes: 9

Views: 6374

Answers (2)

Open AI - Opting Out

Reputation: 1631

Q1: I may be understanding this incorrectly, but in eg1 the someTask() function will be called once per thread, so if it's a simple connection pool, will this error out or be UNSAFE (and what does that mean)?

  • Yes, someTask will be called in each worker thread. Using a SimpleConnectionPool with ThreadPoolExecutor is not a good idea, because ThreadPoolExecutor is a multi-threaded executor and will call your someTask from several threads at once.
  • So eg1 is THREAD UNSAFE: it shares across threads a pool that is only meant to be used from a single thread.

Q2: And in eg2, if the example is fine, what does THREAD SAFE mean? Will the someTask() function be allowed to get a connection out of the pool there, whereas in eg1 it won't?

  • Essentially it means that your workers will know how to play nicely with each other with regard to the connection pool.

  • Worker A won't start taking a connection from the pool, only to be interrupted by Worker B.

  • Workers A, B, C... will interface with the connection pool in an orderly and "friendly" manner, waiting their turn if they need to.
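Part of using the pool in an orderly way is handing each connection back once a worker is done with it. Here is a minimal sketch of that pattern. The helper name `borrowed_connection` is hypothetical (it is not part of psycopg2); it only assumes the pool object offers `getconn()` and `putconn(conn)`, which both psycopg2 pool classes do.

```python
from contextlib import contextmanager


@contextmanager
def borrowed_connection(pool):
    """Borrow one connection from `pool` and always give it back.

    Hypothetical helper: works with any object that has
    getconn() and putconn(conn), such as a psycopg2 pool.
    """
    conn = pool.getconn()
    try:
        yield conn
    finally:
        # Runs even if the body raised, so the connection
        # is not leaked from the pool.
        pool.putconn(conn)


# Usage inside a worker (assumes a pool named threaded_pool exists):
#
#   def someTask(id):
#       with borrowed_connection(threaded_pool) as conn:
#           ...  # do DB operation
```

Without the `putconn` call, every task keeps its connection, and the pool eventually runs out.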

Q3: Is there any performance difference between the two?

  • Yes, based on what you are looking for. eg1 is not safe to rely on, because it uses a single-threaded pool from multiple threads.
  • eg2 uses threading with the thread-safe connection pool, so its workers can work simultaneously, whereas a single-threaded solution would take more time.

Both solutions, if implemented correctly, would be performant. The only difference between a single-threaded and a multi-threaded approach is the elapsed time and the resources consumed at any given moment.

I hope that helps clear things up. Also check out these posts on the subject, here and here.

Upvotes: 1

Khaled Barie

Reputation: 146

According to the documentation of SimpleConnectionPool, it is defined as:

A connection pool that can’t be shared across different threads

Which confirms what you said in your first question. Even if it runs without errors, using a SimpleConnectionPool concurrently in multiple threads might result in undefined behaviour/wrong results due to race conditions between threads.
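To make the race condition concrete, here is a minimal sketch with a toy pool. This is not psycopg2's code: `ToyUnsafePool` is hypothetical, and the `threading.Barrier` is there only to force the unlucky interleaving every time, so that both threads pick a free connection before either one marks it as taken.

```python
import threading


class ToyUnsafePool:
    """Hypothetical pool with no locking (illustration only)."""

    def __init__(self, conns, barrier):
        self._free = list(conns)
        self._barrier = barrier  # only used to force the bad interleaving

    def getconn(self):
        conn = self._free[-1]       # step 1: pick a free connection
        self._barrier.wait()        # both threads pause here after step 1
        try:
            self._free.remove(conn)  # step 2: mark it as taken
        except ValueError:
            pass                     # the other thread already removed it
        return conn


barrier = threading.Barrier(2)
pool = ToyUnsafePool(["conn-A", "conn-B"], barrier)
handed_out = []


def worker():
    handed_out.append(pool.getconn())


threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(handed_out)  # ['conn-B', 'conn-B']: two threads share one connection
```

In real code there is no barrier, so the bad interleaving happens only occasionally, which is why such a program can appear to work and still produce wrong results now and then.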

As for your second question, thread safety means that an object can be used concurrently by multiple threads without the caller needing to handle race conditions. You can see that's the case if you follow the implementation of ThreadedConnectionPool: it uses a lock to ensure that no connection is handed to two threads at the same time.

I cannot comment on the difference in performance between the two as they have different use cases.
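The locking idea can be sketched with a toy pool. `ToyLockedPool` is a hypothetical class written for illustration, not the actual ThreadedConnectionPool source; it only shows the general pattern of guarding the pool's bookkeeping with a `threading.Lock`.

```python
import threading


class ToyLockedPool:
    """Hypothetical pool whose bookkeeping is guarded by a lock."""

    def __init__(self, conns):
        self._free = list(conns)
        self._lock = threading.Lock()

    def getconn(self):
        # Only one thread at a time may inspect and modify the free list.
        with self._lock:
            if not self._free:
                raise RuntimeError("pool exhausted")
            return self._free.pop()

    def putconn(self, conn):
        with self._lock:
            self._free.append(conn)


pool = ToyLockedPool([f"conn-{i}" for i in range(8)])
got = []


def worker():
    # Each worker keeps its connection so we can compare them afterwards.
    got.append(pool.getconn())


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(set(got)))  # 8: every thread received a different connection
```

Note that the lock only protects the hand-out step. Once a thread holds a connection, it is still that thread's job to use it alone and return it when finished.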

Upvotes: 6
