crazyhorse

Reputation: 1

How do I automate a Python script to run every hour in a Django website hosted on Heroku?

My project involves a Django website using data from a .csv file generated from a web scraping script, which needs to be hosted on Heroku. My development OS is Windows 10. When my development server is run, it initially executes the script under the main application's views.py file:

exec(open('homepage/scrape.py').read())

where homepage is the name of the main application of the project and scrape.py is the web scraping script.

I need scrape.py to run every hour, both on a Heroku dyno and in my Windows development environment.

Thanks.

Upvotes: 0

Views: 690

Answers (1)

James Tollefson

Reputation: 893

I recently built an app with very similar functionality. The solution turned out to be quite simple, thankfully.

First, I created a clock.py file that had my actual scheduling functionality in it.

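from apscheduler.schedulers.blocking import BlockingScheduler
from django import setup
from scrape import scrape  # the scraping function from the module in your question; adjust the import path to your project

# setup() needs DJANGO_SETTINGS_MODULE to be set in the environment
setup()  # load Django's settings and app registry before the scheduler starts

sched = BlockingScheduler()  # blocks the process, so it needs its own dyno

@sched.scheduled_job('interval', hours=1)
def hourly_scrape():
    # runs once every hour for as long as this process stays up
    scrape()

sched.start()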
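
The import in clock.py assumes that scrape.py exposes a callable scrape() function instead of being run through exec(). Your script may be organised differently, so treat the following as a minimal sketch of that layout: fetch_rows(), CSV_PATH and the sample row are hypothetical placeholders, not part of your project.

```python
# scrape.py (hypothetical layout; fetch_rows() stands in for the real scraping code)
import csv
from pathlib import Path

# Assumed output location; point this at the file your views actually read.
CSV_PATH = Path("data.csv")


def fetch_rows():
    """Placeholder for the real scraping logic; returns a list of dict rows."""
    return [{"name": "example", "value": "1"}]


def scrape(csv_path=CSV_PATH):
    """Run one scrape, overwrite the CSV, and return the number of rows written."""
    rows = fetch_rows()
    if not rows:
        return 0  # nothing scraped: leave the previous file untouched
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

With the script wrapped in a function like this, the same code can be called from clock.py on Heroku and run by hand on Windows. One thing to keep in mind: each Heroku dyno has its own ephemeral filesystem, so a CSV written by the clock dyno will not be visible to the web dyno. Saving the scraped rows to the database is a common way around that.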

Then I added a separate process type named clock to my Procfile, so the scheduler runs on its own dyno.

clock: python clock.py

As long as you add apscheduler to your requirements.txt and get the extra dyno online, you'll be good to go. Also, don't forget that the clock process won't start until you scale it up. From the command line that'll look something like this:

$ heroku ps:scale clock=1

Upvotes: 1
