Rafael Marques

Reputation: 1445

Concurrent requests with Queue and Thread

I am trying to make concurrent API calls with Python. I based my code on the first answer in this thread: What is the fastest way to send 100,000 HTTP requests in Python?

Currently, my code is broken. I have a main function which creates the queue, populates it, creates and starts the threads, and joins the queue. I also have a target function which should make the GET requests to the API.

The difficulty I am experiencing right now is that the target function does not do the necessary work. The target is called, but it acts as if the queue is empty. The first print ("inside scraper worker") is executed, while the second ("inside scraper worker, queue not empty") is not.

def main_scraper(flights):
  print("main scraper was called, got: ")
  print(flights)
  data = []
  q = Queue()
  map(q.put, flights)
  for i in range(0,  5):
      t = Thread(target = scraper_worker, args = (q, data))
      t.daemon = True
      t.start()
  q.join()
  return data

def scraper_worker(q, data):
  print("inside scraper worker")
  while not q.empty():
    print("inside scraper worker, queue not empty")
    f = q.get()
    url = kiwi_url(f)
    response = requests.get(url)
    response_data = response.json()
    results = parseResults(response_data)
    q.task_done()
    print("task done. results:")
    print(results)
    #f._price = results[0]["price"]
    #f._url = results[0]["deep_link"]
    data.append(results)
  return data

I hope this is enough information for you to help me out. Otherwise, I will rewrite the code into a minimal example that anyone can run.

Upvotes: 2

Views: 7295

Answers (1)

Boyd Johnson

Reputation: 342

I would guess that the flights are not being put on the queue. In Python 3, map(q.put, flights) returns a lazy iterator, and since nothing ever consumes it, q.put is never called and the queue stays empty. I would just iterate.

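from queue import Queue
from threading import Thread

def main_scraper(flights):
  print("main scraper was called, got: ")
  print(flights)
  data = []
  q = Queue()
  # Put each flight on the queue explicitly; map() would not run q.put here.
  for flight in flights:
    q.put(flight)
  # Start five daemon workers (scraper_worker is the function from the question).
  for i in range(5):
    t = Thread(target=scraper_worker, args=(q, data))
    t.daemon = True
    t.start()
  # Block until task_done() has been called for every queued flight.
  q.join()
  return data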
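
To check the laziness described above in isolation, here is a minimal sketch, assuming Python 3. The flight strings are made-up placeholders, not real data from the question. It compares the queue size right after calling map, after consuming the iterator, and after an explicit loop.

```python
from queue import Queue

# Hypothetical placeholder data standing in for the real flight objects.
flights = ["LIS-OPO", "OPO-MAD", "MAD-LIS"]

# In Python 3, map() only builds an iterator; q.put has not been called yet.
lazy_q = Queue()
pending = map(lazy_q.put, flights)
size_before = lazy_q.qsize()

# Consuming the iterator is what actually runs q.put for each item.
list(pending)
size_after = lazy_q.qsize()

# The explicit loop from the answer fills the queue immediately.
loop_q = Queue()
for flight in flights:
    loop_q.put(flight)

print(size_before, size_after, loop_q.qsize())  # prints: 0 3 3
```

Wrapping the call as list(map(q.put, flights)) would also work, but it builds a throwaway list of None values, so the plain for loop is probably the clearer choice.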

Upvotes: 2
