user13637625

Looping and checking if it's time for a refresh: is threading a solution for this?

I have a loop that scrapes 30 links over and over, and roughly every 10 hours I want to clear another list that holds certain found data. Here's an abstract example of what I have now; I'm wondering whether the refresh timer should run in a separate thread.

from datetime import datetime

def check_refresh(time_since_refresh):
    # Whole hours elapsed since the last refresh
    time_difference = round((datetime.now() - time_since_refresh).total_seconds() / 60 / 60)
    return time_difference >= CLEAR_FOUND


last_refresh = datetime.now()
while True:
    scrape(url)
    if check_refresh(last_refresh):
        temporary_list.clear()
        last_refresh = datetime.now()  # restart the interval

The thing is, scraping speed is important to me, so I feel that checking whether it is time to refresh on every pass through the loop will slow it down.
Should the refresh timer run in a separate thread and set a boolean flag that the scraper reads on each pass? Also, is there a better way to track how much time has elapsed since the loop started than my "hack" with check_refresh?

Upvotes: 0

Views: 21

Answers (1)

Krishnan Shankar

Reputation: 942

Yes, threading is a better idea. Try this:

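import threading
import time

refresh_due = threading.Event()  # flag shared with the scraping loop

def refresh_timer():
    # Sleep for CLEAR_FOUND hours, raise the flag, then repeat.
    while True:
        time.sleep(CLEAR_FOUND * 60 * 60)
        refresh_due.set()

# daemon=True so the timer thread does not keep the program alive on exit
t = threading.Thread(target=refresh_timer, daemon=True)
t.start()

while True:
    scrape(url)
    if refresh_due.is_set():
        temporary_list.clear()
        refresh_due.clear()  # lower the flag until the next interval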

Threading is better here because a thread does its work while other code is executing, instead of pausing the currently executing code, so the scraping loop never has to wait on the timer. Hope this helps!
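On the second part of the question (a cleaner way to track elapsed time), here is a minimal sketch that does not use a thread at all. It assumes the interval is given in seconds. The `RefreshTimer` class and its `clock` parameter are hypothetical names made up for illustration, not anything from the original code. It uses `time.monotonic()`, which is not affected by system clock changes, and the check is a single subtraction and comparison, which is likely negligible next to a network request.

```python
import time


class RefreshTimer:
    """Reports when a fixed interval has elapsed since the last refresh.

    Hypothetical helper for illustration; `clock` is injectable so the
    class can be exercised without actually waiting.
    """

    def __init__(self, interval_seconds, clock=time.monotonic):
        self.interval = interval_seconds
        self.clock = clock
        self.last_refresh = clock()

    def due(self):
        """Return True once per elapsed interval, restarting the interval."""
        now = self.clock()
        if now - self.last_refresh >= self.interval:
            self.last_refresh = now
            return True
        return False
```

In the scraping loop this would be used as `timer = RefreshTimer(10 * 60 * 60)` before the loop, then `if timer.due(): temporary_list.clear()` after each `scrape(url)` call.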

Upvotes: 1
