Reputation: 1163
I'm currently facing a problem where I give each thread a reference to a set, and I want to be able to replace the contents of that set with the result of a (mocked) database call. So far I have:
import logging
import threading
import time
from typing import Callable
from loguru import logger
class MonitorProduct:
    def __init__(self, term: str, is_alive: Callable[[str], bool]) -> None:
        self.is_alive = is_alive
        self.term = term

    def do_request(self) -> None:
        time.sleep(.1)
        while True:
            logger.info(f'Checking {self.term}')
            if not self.is_alive(self.term):
                logger.info(f'Deleting term from monitoring: "{self.term}"')
                return
            time.sleep(5)
# mocked database
def database_terms() -> set[str]:
    return {
        'hello world',
        'python 3',
        'world',
        'wth',
    }

def database_terms_2() -> set[str]:
    return {
        'what am I doing wrong',
    }
def main() -> None:
    terms: set[str] = set()
    while True:
        db_terms = database_terms()
        diff = db_terms - terms
        terms.symmetric_difference_update(db_terms)
        for url in diff:
            logger.info(f'Starting URL: {url}')
            threading.Thread(
                target=MonitorProduct(url, terms.__contains__).do_request
            ).start()
        time.sleep(2)
        # ----------------------------------------------- #
        db_terms = database_terms_2()
        diff = db_terms - terms
        terms.symmetric_difference_update(db_terms)  # <--- terms should only now contain `what am I doing wrong`
        # Start the new URLs
        for url in diff:
            logger.info(f'Starting URL 2: {url}')
            threading.Thread(
                target=MonitorProduct(url, terms.__contains__).do_request
            ).start()
        time.sleep(10)

if __name__ == '__main__':
    main()
The problem I am now having is this. When we do our first db call, it should start one thread for each of the terms:
{
    'hello world',
    'python 3',
    'world',
    'wth',
}
and, as you can see, we also pass terms.__contains__ to each thread.
When we do the second db call, that set should replace the contents of terms with
{
    'what am I doing wrong',
}
which should end up exiting the four running threads due to:
def do_request(self) -> None:
    time.sleep(.1)
    while True:
        logger.info(f'Checking {self.term}')
        if not self.is_alive(self.term):
            logger.info(f'Deleting term from monitoring: "{self.term}"')
            return
        time.sleep(5)
However, the problem is that we cannot replace terms by doing terms = ..., because that creates a new set and binds it to the variable terms, while the threads still hold a reference to the old set.
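To illustrate the rebinding problem in isolation, here is a minimal sketch without threads. The variable names are made up for this example; it only shows that a bound `__contains__` keeps pointing at the original set object after the name is rebound.

```python
# The set the monitoring threads were started with.
terms = {'hello world', 'python 3', 'world', 'wth'}

# A bound method holds a reference to this exact set object,
# not to the name `terms`.
is_alive = terms.__contains__

# Rebinding: `terms` now names a brand-new set, but the old set
# still exists (the bound method keeps it alive) and is unchanged.
terms = {'what am I doing wrong'}

still_alive = is_alive('world')   # looks up in the OLD set
in_new_set = 'world' in terms     # looks up in the NEW set
```

So after the assignment, `still_alive` is true even though `'world'` is no longer in `terms`, which is why the threads never exit.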
My question is: how can I update the existing terms set in place so that it matches the newest set, without binding a new set to the name?
Upvotes: 2
Views: 172
Reputation: 9153
You're almost there, but
diff = db_terms - terms
terms ^= diff # symmetric_difference_update()
isn't enough, because that just adds the new values, so it's the same as
terms |= diff # update()
or even
terms |= db_terms # update()
(Either of these options should be clearer to the reader than the symmetric difference, because you're not using the symmetric difference to remove anything.)
To remove the old values, you want to also do
terms &= db_terms # intersection_update()
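Putting the two steps together, a minimal sketch could look like the following. The helper name `sync_terms` is hypothetical (it is not in your code); it mutates the set that the threads already reference and returns the terms that still need a thread.

```python
def sync_terms(terms: set[str], db_terms: set[str]) -> set[str]:
    """Make `terms` equal to `db_terms` in place; return the newly added terms."""
    added = db_terms - terms   # terms that need a new monitoring thread
    terms |= db_terms          # update(): add the new values
    terms &= db_terms          # intersection_update(): drop the stale values
    return added


terms: set[str] = set()
is_alive = terms.__contains__  # what each thread would receive

first = sync_terms(terms, {'hello world', 'python 3', 'world', 'wth'})
second = sync_terms(terms, {'what am I doing wrong'})
```

After the second call, `is_alive('world')` is false, so the four old threads would return on their next check. Adding before removing means a term that is in both the old and the new set never disappears from `terms` in between.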
You said you're concerned about race conditions with intermediate values of the set. If you want to modify the set from more than one thread, you should use a mutex lock (threading.RLock) around it. But if you're only modifying it from one thread and calling __contains__ from the others, you can avoid a lock in CPython as long as each step of execution keeps your set in a consistent state.
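If you do end up needing the lock, one possible shape is a small wrapper that owns both the set and the lock. This is only a sketch: the class name `TermRegistry` and its `replace` method are made up for illustration.

```python
import threading


class TermRegistry:
    """Hypothetical wrapper: a set of terms guarded by a re-entrant lock."""

    def __init__(self) -> None:
        self._lock = threading.RLock()
        self._terms: set[str] = set()

    def replace(self, db_terms: set[str]) -> set[str]:
        """Make the stored set equal to `db_terms`; return the newly added terms."""
        with self._lock:
            added = db_terms - self._terms
            self._terms |= db_terms   # add new values
            self._terms &= db_terms   # remove stale values
            return added

    def __contains__(self, term: str) -> bool:
        with self._lock:
            return term in self._terms


registry = TermRegistry()
is_alive = registry.__contains__      # pass this to each MonitorProduct

registry.replace({'hello world', 'world'})
alive_before = is_alive('world')

registry.replace({'what am I doing wrong'})
alive_after = is_alive('world')
```

Because readers take the same lock as `replace`, they only ever see the set before or after a full replacement, never the intermediate state.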
Upvotes: 1