Reputation: 1829
I have a Python 3 script (using the PRAW library) that executes once and ends. It is currently automated with a cron job and runs every 45 minutes.
The script needs to become a persistent bot that is always "online", so cron can no longer be used. Part of this is easy enough:
def bot_loop():
    running = True
    while running:
        try:
            # This should occur constantly and monitor the Reddit API stream
            for submission in subreddit.stream.submissions(skip_existing=False):
                print(submission.title)
                # TODO: add a condition so the call below only executes every 45 minutes
                health_checks()
        except KeyboardInterrupt:
            print('Keyboard Interrupt. Ending bot.')
            running = False
        except Exception as e:
            print('Exception raised per below. Attempting to continue bot in 10 seconds.')
            print(e)
            time.sleep(10)
What would be the best logic in this loop to ensure the health checks only run every 45 minutes, while the rest of the script continues to process? Additionally, what is the best way to ensure that if the check does not run exactly on the 45th minute (say at xx:45:00), perhaps because the CPU is busy elsewhere, it runs at the next opportunity?
The logic should be:
One consideration is checking if minute == 45, but that alone has issues (it would run at least 60 times in that minute).
Upvotes: 0
Views: 1594
Reputation: 15619
Several Python modules can meet this need. Some of these modules include:
Advanced Python Scheduler (APScheduler)
In 2019, I posted an answer for another question on using a scheduler in Python. Here is that question and my answer.
Concerning your question, here are two ways to tackle the problem.
Timeloop Example:
import time
from datetime import timedelta
from timeloop import Timeloop

tl = Timeloop()


# @tl.job(interval=timedelta(minutes=45))
@tl.job(interval=timedelta(minutes=1))
def health_checks():
    print('running a health check')
    print("job current time : {}".format(time.ctime()))


def bot_loop():
    # timeloop is designed to run on a separate thread
    tl.start()
    while True:
        try:
            print('running bot')
            time.sleep(10)
        except KeyboardInterrupt:
            print('Keyboard Interrupt. Ending bot.')
            tl.stop()
            # leave the loop, otherwise the bot keeps running after the interrupt
            break
        except Exception as e:
            print('Exception raised per below. Attempting to continue bot in 10 seconds.')
            print(e)
            time.sleep(10)


if __name__ == "__main__":
    bot_loop()
Schedule Example:
import schedule
import time


def health_checks():
    print('running a health check')
    print("job current time : {}".format(time.ctime()))


def bot_loop():
    while True:
        try:
            # run any job that is due, then carry on with the bot's own work
            schedule.run_pending()
            print('running bot')
            time.sleep(10)
        except KeyboardInterrupt:
            print('Keyboard Interrupt. Ending bot.')
            # remove all scheduled jobs and leave the loop
            schedule.clear()
            break
        except Exception as e:
            print('Exception raised per below. Attempting to continue bot in 10 seconds.')
            print(e)
            time.sleep(10)


if __name__ == "__main__":
    # schedule.every(45).minutes.do(health_checks)
    schedule.every(1).minutes.do(health_checks)
    # bot_loop() never returns until interrupted, so it polls the scheduler itself
    bot_loop()
Upvotes: 1
Reputation: 17247
The inner core of the program looks like:
for data in read_from_data_stream():
    process(data)
where read_from_data_stream() is provided by some third-party library and yields some kind of data as it arrives. In addition to the service above, a health_check() function should be called every 45 minutes.
The problem is that we want to do two activities at the same time. This is not possible in a regular single-threaded program. But is it really a problem?
We could do at least some health checking:
CHECKTIME = 45*60.0 # in seconds
last_check = time.monotonic()
for data in read_from_data_stream():
    process(data)
    now = time.monotonic()
    if now > last_check + CHECKTIME:
        health_check()
        last_check = now
and it could be sufficient, even almost equivalent to the original specification, if data is coming often, say at least every few seconds or so.
But even if new data is not available for longer periods of time, it might still be acceptable. If there is no activity and no data processing, the health_check could be unnecessary, because nothing has changed since the last one.
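The elapsed-time test can be pulled into a small helper so the loop body stays short. This is only a sketch: the class name IntervalGate and its injectable clock parameter are made up for illustration and are not part of PRAW or any other library. Because it compares against a monotonic clock rather than the wall-clock minute, a check that is late (for example because the CPU was busy) simply fires at the next opportunity, and only once.

```python
import time


class IntervalGate:
    """Reports True at most once per `interval` seconds (hypothetical helper)."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock  # injectable so the gate can be tested without waiting
        self.next_due = clock() + interval

    def due(self):
        now = self.clock()
        if now < self.next_due:
            return False
        # Re-arm from "now", so a late check does not cause a burst of catch-up runs.
        self.next_due = now + self.interval
        return True
```

Inside the loop this would be used as `gate = IntervalGate(45 * 60)` before the loop and `if gate.due(): health_check()` after `process(data)`.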
The code can be further improved if read_from_data_stream() offers a timeout option:
while True:
    try:
        for data in read_from_data_stream(timeout=CHECKTIME):
            ... for loop body as above ...
    except DataTimeoutError:
        ... run extra health_check ...
        continue
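Some stream libraries signal "nothing new" by yielding a placeholder instead of raising a timeout. I believe PRAW's stream generators accept a pause_after argument that makes the stream yield None in that situation, but please check the PRAW documentation before relying on it. The sketch below assumes such a stream; run_with_idle_ticks and its parameters are hypothetical names, not a library API.

```python
import time


def run_with_idle_ticks(stream, process, health_check, interval,
                        clock=time.monotonic):
    """Process items from `stream`, treating None as "nothing new right now"."""
    last_check = clock()
    for item in stream:
        if item is not None:
            process(item)
        # This point is reached after real items and after idle ticks alike,
        # so the health check still runs while the stream is quiet.
        now = clock()
        if now - last_check >= interval:
            health_check()
            last_check = now
```

With PRAW, the first argument might be something like subreddit.stream.submissions(pause_after=0), assuming that parameter behaves as described above.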
If the solution above is not good enough, there are two options: use an async version of the library, if available, or spawn a new thread.
The main thread runs the loop, the additional thread runs the health check:
while True:
    time.sleep(CHECKTIME)
    health_check()
but almost certainly health_check() and process(data) access the same internal data structures, making a mutex lock mandatory.
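A minimal sketch of the threaded variant with the lock in place, using only the standard library. The function name start_health_thread is hypothetical. It waits on a threading.Event instead of calling time.sleep, so the thread can be asked to stop without waiting out the remaining 45 minutes.

```python
import threading


def start_health_thread(health_check, interval, lock):
    """Run `health_check` every `interval` seconds on a daemon thread.

    Returns an Event; setting it asks the thread to stop.
    """
    stop = threading.Event()

    def worker():
        # Event.wait returns False on timeout and True once `stop` is set,
        # so this sleeps for `interval` but wakes immediately on shutdown.
        while not stop.wait(interval):
            with lock:  # the same lock the main loop holds while processing
                health_check()

    threading.Thread(target=worker, daemon=True).start()
    return stop
```

The main loop would then wrap its own work in the same lock, for example `with lock: process(data)`, and call `stop.set()` when the bot shuts down.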
Upvotes: 1
Reputation: 3549
Two options come to mind:
import schedule
import time


def job():
    print("I'm working...")


schedule.every(45).minutes.do(job)
schedule.every().hour.do(job)
schedule.every().day.at("10:30").do(job)
schedule.every(5).to(10).minutes.do(job)
schedule.every().monday.do(job)
schedule.every().wednesday.at("13:15").do(job)
schedule.every().minute.at(":17").do(job)

while True:
    schedule.run_pending()
    time.sleep(1)
Upvotes: 2
Reputation: 381
You need multi-threading for that purpose.
import threading


def main_code():
    print("Doing health checks and other stuff...")
    threading.Timer(45*60, main_code).start()
Call main_code at start of bot_loop. The code schedules itself to run 45 minutes later each time it is called.
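One drawback of the self-rescheduling form is that nothing keeps a handle on the pending timer, so it cannot be cancelled, and a non-daemon timer thread can keep the process alive after the bot has been interrupted. A possible wrapper is sketched below; RepeatingTimer is a hypothetical name, not a standard-library class. Unlike main_code above, it waits one full interval before the first call, and the interval is measured from the end of each call, so the schedule drifts by however long the function takes.

```python
import threading


class RepeatingTimer:
    """Calls `func` every `interval` seconds until cancel() is called."""

    def __init__(self, interval, func):
        self.interval = interval
        self.func = func
        self._timer = None
        self._cancelled = False
        self._lock = threading.Lock()

    def start(self):
        with self._lock:
            if self._cancelled:
                return
            self._timer = threading.Timer(self.interval, self._run)
            self._timer.daemon = True  # do not keep the process alive on exit
            self._timer.start()

    def _run(self):
        self.func()
        self.start()  # re-arm only after func has returned

    def cancel(self):
        with self._lock:
            self._cancelled = True
            if self._timer is not None:
                self._timer.cancel()
```

In the bot this could be started with `RepeatingTimer(45 * 60, health_checks).start()` before entering the loop, with `cancel()` called in the KeyboardInterrupt handler.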
Executing periodic actions in Python
What is the best way to repeatedly execute a function every x seconds?
Upvotes: 1
Reputation: 333
Try using Celery. With Celery, you can relaunch the subsequent task with a 45-minute countdown when the task finishes.
PS. I am not writing the entire snippet, just a skeleton. You can use max_retries etc. for multiple attempts in case of failure.
@task
def my_task(...):
    ....
    # apply_async accepts scheduling options; countdown is in seconds
    # (eta expects a datetime, and delay() does not take either option)
    my_task.apply_async(args=..., countdown=45 * 60)
Upvotes: 3