How to use the schedule package to web scrape with Python

Question

I wrote this script to scrape the Climatempo website. It uses Beautiful Soup to extract the information and pandas to export it to Excel.

But when I add the schedule call at the end of the code, the script doesn't run automatically.

Is this related to the fact that the script does web scraping? Is that why it isn't working?

import time

import requests
import schedule
from bs4 import BeautifulSoup as BS

def extractInfo():
    # Fetch and parse the current-conditions page
    html = requests.get("https://www.climatempo.com.br/previsao-do-tempo/agora/cidade/321/riodejaneiro-rj").content
    now = BS(html, "lxml")

    # Fetch and parse today's forecast page
    html = requests.get("https://www.climatempo.com.br/previsao-do-tempo/cidade/321/riodejaneiro-rj/").content
    today = BS(html, "lxml")

.......



schedule.every().day.at("08:00").do(extractInfo)

while True:
    schedule.run_pending()
    time.sleep(1)

The full script is on GitHub: https://github.com/Tayzerdo/Webscraping-from-climatempo/blob/main/WebScraping.py

AKX · Accepted Answer

A script can't start itself automatically. The schedule package is meant for long-running programs that need to do things periodically: a job only fires while the script is running and calling schedule.run_pending().

Depending on your OS, you might want to look into:

systemd timers or cron jobs (Linux)
cron jobs (macOS)
the Task Scheduler (Windows)
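
If you go the OS scheduler route, one possible shape is a script that runs the job once and exits, leaving the timing to cron or the Task Scheduler. The sketch below is only an illustration: extract_info here is a stand-in for the real scraping function, and the interpreter and script paths in the crontab comment are placeholders, not values taken from the question.

```python
import datetime


def extract_info():
    """Stand-in for the scraping job (assumption: the real one fetches and parses the pages)."""
    started = datetime.datetime.now()
    print(f"extract_info ran at {started:%Y-%m-%d %H:%M:%S}")
    return started


def main():
    # Run the job once and exit; the OS scheduler decides when to start the script.
    return extract_info()


if __name__ == "__main__":
    main()

# Hypothetical crontab entry (Linux/macOS) to run the script every day at 08:00.
# Both paths are placeholders:
# 0 8 * * * /usr/bin/python3 /path/to/WebScraping.py
```

With this layout there is no while True loop and no schedule import: each scheduled start is a fresh process that does its work and exits.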
