Zeno

Reputation: 1829

Execute logic every X minutes (without cron)?

I have a Python 3 script (using the PRAW library) that executes once and exits. It is currently automated with a cron job and runs every 45 minutes.

I need to change this into a persistent bot that is always "online", so cron can no longer be used. Part of this is easy enough:

def bot_loop():
    running = True
    while running:
        try:
            # This should occur constantly and monitor Reddit API stream
            for submission in subreddit.stream.submissions(skip_existing=False):
                print(submission.title)
                # TODO: condition so that the line below only executes every 45 minutes
                health_checks()
        except KeyboardInterrupt:
            print('Keyboard Interrupt. Ending bot.')
            running = False
        except Exception as e:
            print('Exception raised per below. Attempting to continue bot in 10 seconds.')
            print(e)
            time.sleep(10)

What would be the best logic in this loop to ensure the health checks only run every 45 minutes, while the rest of the script continues to process submissions? Additionally, if for some reason the checks do not run exactly on schedule (say at xx:45:00, perhaps because the CPU is busy elsewhere), what is the best way to ensure they run at the next opportunity?

The logic should be:

One option would be to check if minute == 45, but that alone has issues: the condition stays true for the whole minute, so the checks would run at least 60 times in that minute.

Upvotes: 0

Views: 1594

Answers (5)

Life is complex

Reputation: 15619

Several Python modules can handle this kind of scheduling, including:

Advanced Python Scheduler (APScheduler)

schedule

timeloop

In 2019, I posted an answer for another question on using a scheduler in Python. Here is that question and my answer.

Concerning your question, here are two ways to tackle the problem.

Timeloop Example:

import time
from datetime import timedelta
from timeloop import Timeloop

tl = Timeloop()

# @tl.job(interval=timedelta(minutes=45))
@tl.job(interval=timedelta(minutes=1))  # 1 minute for testing; use 45 for the real bot
def health_checks():
    print('running a health check')
    print("job current time : {}".format(time.ctime()))

def bot_loop():
    # timeloop is designed to run on a separate thread
    tl.start()
    while True:
        try:
            print('running bot')
            time.sleep(10)
        except KeyboardInterrupt:
            print('Keyboard Interrupt. Ending bot.')
            tl.stop()
            break  # leave the loop, otherwise the bot keeps running
        except Exception as e:
            print('Exception raised per below. Attempting to continue bot in 10 seconds.')
            print(e)
            time.sleep(10)


if __name__ == "__main__":
    bot_loop()

Schedule Example:

import schedule
import time

def health_checks():
    print('running a health check')
    print("job current time : {}".format(time.ctime()))

def bot_loop():
    while True:
        try:
            print('running bot')
            # run any job that is due; an overdue job runs on the next pass
            schedule.run_pending()
            time.sleep(10)
        except KeyboardInterrupt:
            print('Keyboard Interrupt. Ending bot.')
            schedule.clear()  # remove all scheduled jobs
            break
        except Exception as e:
            print('Exception raised per below. Attempting to continue bot in 10 seconds.')
            print(e)
            time.sleep(10)


if __name__ == "__main__":
    # schedule.every(45).minutes.do(health_checks)
    schedule.every(1).minutes.do(health_checks)  # 1 minute for testing
    # run_pending() is called inside bot_loop; a separate "while True" loop
    # here would block forever and bot_loop would never start
    bot_loop()

Upvotes: 1

VPfB

Reputation: 17247

The inner core of the program looks like:

for data in read_from_data_stream():
    process(data)

where read_from_data_stream() is provided by some 3rd party library and yields some kind of data as it arrives. In addition to the service above, a health_check() function should be called every 45 minutes.

The problem is that we want to do two activities at the same time. This is not possible in a regular single-threaded program. But is it really a problem?

We could do at least some health checking:

CHECKTIME = 45*60.0  # in seconds

last_check = time.monotonic()
for data in read_from_data_stream():
    process(data)
    now = time.monotonic()
    if now > last_check + CHECKTIME:
        health_check()
        last_check = now

and it could be sufficient, even almost equivalent to the original specification, if data is coming often, say at least every few seconds or so.

But even if new data is not available for longer periods of time, it might be still acceptable. If there is no activity, no data processing, the health_check could be unnecessary, because nothing has changed since the last one.
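As a minimal sketch, the elapsed-time check above can be wrapped in a small helper so that it can be exercised with a fake clock. IntervalGate and run are hypothetical names, not part of any library; stream, process and health_check stand in for the stream iterator and the functions from the question. Because the test is "has at least the interval elapsed" rather than "is the minute exactly 45", a check that is late (busy CPU, slow item) simply runs when the next item arrives.

```python
import time


class IntervalGate:
    """Report when at least `interval` seconds have passed since the last run."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock          # injectable so tests can use a fake clock
        self.last_run = clock()

    def due(self):
        """Return True (and restart the interval) once the interval has elapsed."""
        now = self.clock()
        if now - self.last_run >= self.interval:
            self.last_run = now
            return True
        return False


def run(stream, process, health_check, interval=45 * 60, clock=time.monotonic):
    """Process every item; call health_check at most once per interval."""
    gate = IntervalGate(interval, clock)
    for item in stream:
        process(item)
        if gate.due():
            health_check()
```

In the question's bot, the stream argument would presumably be subreddit.stream.submissions(skip_existing=False) and health_check would be health_checks.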

The code can be further improved if the read_from_data_stream() offers a timeout option:

while True:
    try:
        for data in read_from_data_stream(timeout=CHECKTIME):
            ... for loop body as above ...
    except DataTimeoutError:
        ... run extra health_check ...
        continue

If the solution above is not good enough, there are two options: use an async version of the library, if available, or spawn a new thread.

With a thread, the main thread runs the stream loop and the additional thread runs the health check:

while True:
    time.sleep(CHECKTIME)
    health_check()

but almost certainly the health_check() and process(data) are accessing the same internal data structures, making a mutex lock mandatory.
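A minimal sketch of the threaded variant with a lock, under stated assumptions: SharedState and its counters are hypothetical stand-ins for whatever process and health_check really share, and start_health_thread is an illustrative helper name. It uses threading.Event.wait as an interruptible sleep so the health thread can be stopped promptly.

```python
import threading


class SharedState:
    """Hypothetical stand-in for data touched by both threads."""

    def __init__(self):
        self.lock = threading.Lock()
        self.processed = 0
        self.checks = 0


def process(state, data):
    with state.lock:  # guard every access to the shared counters
        state.processed += 1


def health_check(state):
    with state.lock:
        state.checks += 1
        return state.processed


def health_loop(state, stop, interval):
    # Event.wait() returns False on timeout and True once stop is set,
    # so the loop sleeps for `interval` but exits promptly on shutdown.
    while not stop.wait(interval):
        health_check(state)


def start_health_thread(state, interval=45 * 60):
    stop = threading.Event()
    worker = threading.Thread(
        target=health_loop, args=(state, stop, interval), daemon=True
    )
    worker.start()
    return stop, worker
```

The main thread keeps calling process(state, data) inside its stream loop and calls stop.set() on shutdown. The lock is held only for the duration of each access, so a long health check would delay processing only while it actually touches the shared data.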

Upvotes: 1

hrokr

Reputation: 3549

Two options come to mind:

  1. The sched event scheduler from the standard library.
  2. The very similarly named schedule library by Dan Bader of Real Python.

An example using schedule:

import schedule
import time

def job():
    print("I'm working...")

schedule.every(45).minutes.do(job)
schedule.every().hour.do(job)
schedule.every().day.at("10:30").do(job)
schedule.every(5).to(10).minutes.do(job)
schedule.every().monday.do(job)
schedule.every().wednesday.at("13:15").do(job)
schedule.every().minute.at(":17").do(job)

while True:
    schedule.run_pending()
    time.sleep(1)
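For the first option, here is a minimal sketch using the standard library's sched module. The helper names periodic and start are hypothetical; sched itself only runs one-shot events, so the action re-enters itself to become periodic. Calling scheduler.run(blocking=False) executes whatever is already due and returns immediately, so it can be called on each pass of the bot loop.

```python
import sched
import time

# monotonic clock: unaffected by wall-clock adjustments
scheduler = sched.scheduler(time.monotonic, time.sleep)


def health_checks():
    print("running a health check")


def periodic(interval, action):
    """Run action, then queue the next run `interval` seconds from now."""
    action()
    scheduler.enter(interval, 1, periodic, argument=(interval, action))


def start(interval=45 * 60, action=health_checks):
    # The first run happens one interval after start() is called.
    scheduler.enter(interval, 1, periodic, argument=(interval, action))
```

After start(), the bot loop would call scheduler.run(blocking=False) once per pass. An event that became due while the loop was busy runs on the next call, which covers the "next opportunity" requirement.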

Upvotes: 2

Nima

Reputation: 381

You need multi-threading for that purpose.

import threading


def main_code():
    print("Doing health checks and other stuff...")
    threading.Timer(45*60, main_code).start()

Call main_code at the start of bot_loop. The function runs immediately, and each time it is called it schedules itself to run again 45 minutes later.
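If the bot needs to shut down cleanly, the same self-rescheduling idea can be wrapped so the pending timer can be cancelled. This is only a sketch: RepeatingTimer is a hypothetical helper class, not something threading provides. Unlike main_code above, it waits one interval before the first call.

```python
import threading


class RepeatingTimer:
    """Call `function` every `interval` seconds until stop() is called."""

    def __init__(self, interval, function):
        self.interval = interval
        self.function = function
        self._timer = None
        self._stopped = False
        self._lock = threading.Lock()

    def _run(self):
        self.function()
        self._schedule()  # queue the next call after this one finishes

    def _schedule(self):
        with self._lock:
            if self._stopped:
                return
            self._timer = threading.Timer(self.interval, self._run)
            self._timer.daemon = True  # do not keep the process alive on exit
            self._timer.start()

    def start(self):
        self._schedule()

    def stop(self):
        with self._lock:
            self._stopped = True
            if self._timer is not None:
                self._timer.cancel()  # no effect if the timer already fired
```

Usage would be something like checker = RepeatingTimer(45 * 60, health_checks) followed by checker.start(), with checker.stop() in the KeyboardInterrupt handler.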

Executing periodic actions in Python

What is the best way to repeatedly execute a function every x seconds?

Upvotes: 1

Ehtesham Siddiqui

Reputation: 333

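Try using Celery. With Celery, the task can relaunch itself when it finishes, scheduled to run 45 minutes later (via countdown in seconds, or via eta with an absolute datetime).

PS. I am not writing the entire snippet, just a skeleton. You can use max_retries etc. for multiple attempts in case of failure.

@task
def my_task(...):
     ....
     # delay() does not accept scheduling options; apply_async() does.
     # countdown is given in seconds; eta would need a datetime instead.
     my_task.apply_async(args=..., countdown=45 * 60)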

Upvotes: 3
