Tayzer Damasceno

Reputation: 302

How to use the schedule package for web scraping in Python

I wrote this script to scrape the "Clima tempo" website. It uses Beautiful Soup to extract the information and pandas to store it and export it to Excel.

But when I add the schedule call at the end of the code, the script doesn't run automatically.

Does this have anything to do with the web scraping? Is that why it isn't working?

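import time

import requests
import schedule
from bs4 import BeautifulSoup as BS  # assumed: BS is an alias for BeautifulSoup

def extractInfo():
    # Fetch the current-conditions page, then today's forecast page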
    html = requests.get("https://www.climatempo.com.br/previsao-do-tempo/agora/cidade/321/riodejaneiro-rj").content
    now = BS(html, "lxml")
    
    html = requests.get("https://www.climatempo.com.br/previsao-do-tempo/cidade/321/riodejaneiro-rj/").content
    today = BS(html, "lxml")

.......



schedule.every().day.at("08:00").do(extractInfo)

while True:
    schedule.run_pending()
    time.sleep(1)

The full script is on GitHub: https://github.com/Tayzerdo/Webscraping-from-climatempo/blob/main/WebScraping.py

Upvotes: 0

Views: 159

Answers (1)

AKX

Reputation: 169051

A script can't start itself automatically: schedule only fires jobs while the Python process is already running. The package is meant for long-running programs that need to do things periodically.
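To make that concrete, here is a rough stdlib-only sketch of what an in-process scheduler does. It is not the schedule package's actual implementation; the names next_run, run_pending and simulate, and the dates used, are hypothetical and chosen only for illustration. The point is that the job fires only when a loop in a live process keeps checking the clock.

```python
from datetime import datetime, timedelta


def next_run(now: datetime, hour: int, minute: int) -> datetime:
    """Return the next hour:minute that lies strictly after `now`."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def run_pending(now: datetime, due: datetime, job) -> datetime:
    """Run `job` if it is due and return the (possibly updated) due time."""
    if now >= due:
        job()
        return next_run(now, due.hour, due.minute)
    return due


def simulate(start: datetime, ticks: int) -> list:
    """Advance a fake clock one second per tick, as the while loop would."""
    runs = []
    clock = start
    due = next_run(start, 8, 0)  # daily job at 08:00
    for _ in range(ticks):
        due = run_pending(clock, due, lambda: runs.append(clock))
        clock += timedelta(seconds=1)
    return runs


if __name__ == "__main__":
    # The process starts two seconds before 08:00 and stays alive for four ticks.
    print(simulate(datetime(2024, 1, 1, 7, 59, 58), 4))
```

If the process is not running when 08:00 arrives, nothing calls run_pending, so the job never fires. The same applies to the while True loop in the question.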

Depending on your OS, you might want to look into:

  • systemd timers or cron jobs (Linux)
  • cron jobs (macOS)
  • the Task Scheduler (Windows)
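With any of these, the script no longer needs schedule or a while loop: it does its work once and exits, and the OS starts it again at the next scheduled time. A minimal sketch, assuming the scraping logic lives in a function like the question's extractInfo (shown here as a placeholder named extract_info), with hypothetical paths in the crontab line:

```python
# Run-once script meant to be started by the OS scheduler.
#
# Example crontab entry for a daily run at 08:00 (both paths are hypothetical):
#   0 8 * * * /usr/bin/python3 /path/to/WebScraping.py
from datetime import datetime


def extract_info() -> str:
    """Placeholder for the real scraping code; returns a short status line."""
    return f"scrape ran at {datetime.now():%Y-%m-%d %H:%M:%S}"


def main() -> int:
    # Do the work once and return; no scheduling loop is needed here.
    print(extract_info())
    return 0


if __name__ == "__main__":
    main()
```

The trade-off is that the schedule lives outside the code, in the crontab, a systemd timer unit, or a Task Scheduler entry, so the machine only has to be on at the scheduled time and no Python process has to be kept alive in between.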

Upvotes: 1
