Reputation: 1445
I am trying to make concurrent API calls with Python. I based my code on the solution (first answer) presented in this thread: What is the fastest way to send 100,000 HTTP requests in Python?
Currently, my code is broken. I have a main function which creates the queue, populates it, creates the threads, starts them, and joins the queue. I also have a target function which should make the GET requests to the API.
The difficulty I am experiencing right now is that the target function does not do the necessary work. The target is called, but it acts as if the queue were empty. The first print ("inside scraper worker") is executed, while the second ("inside scraper worker, queue not empty") is not.
import requests
from queue import Queue
from threading import Thread

def main_scraper(flights):
    print("main scraper was called, got: ")
    print(flights)
    data = []
    q = Queue()
    map(q.put, flights)
    for i in range(0, 5):
        t = Thread(target=scraper_worker, args=(q, data))
        t.daemon = True
        t.start()
    q.join()
    return data

def scraper_worker(q, data):
    print("inside scraper worker")
    while not q.empty():
        print("inside scraper worker, queue not empty")
        f = q.get()
        # kiwi_url and parseResults are helpers defined elsewhere in my script
        url = kiwi_url(f)
        response = requests.get(url)
        response_data = response.json()
        results = parseResults(response_data)
        q.task_done()
        print("task done. results:")
        print(results)
        # f._price = results[0]["price"]
        # f._url = results[0]["deep_link"]
        data.append(results)
    return data
I hope this is enough information for you to help me out. Otherwise, I will rewrite the code into a self-contained example that anyone can run.
Upvotes: 2
Views: 7295
Reputation: 342
I would guess that the flights are never being put on the queue. In Python 3, map(q.put, flights) returns a lazy iterator; since its result is never consumed, q.put is never actually called, so it is as if the line didn't happen. I would just iterate.
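You can see the laziness directly with a small standalone sketch (the sample values here are mine, not from your code):

from queue import Queue

q = Queue()
map(q.put, [1, 2, 3])        # builds a lazy map object; q.put never runs
print(q.qsize())             # 0 -- nothing was enqueued

list(map(q.put, [1, 2, 3]))  # consuming the iterator forces the calls
print(q.qsize())             # 3

With an explicit loop, the queue is populated eagerly: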
def main_scraper(flights):
    print("main scraper was called, got: ")
    print(flights)
    data = []
    q = Queue()
    for flight in flights:
        q.put(flight)
    for i in range(0, 5):
        t = Thread(target=scraper_worker, args=(q, data))
        t.daemon = True
        t.start()
    q.join()
    return data
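If you'd rather keep the map style, forcing the iterator also works; a one-liner sketch, assuming Python 3:

import collections

# deque with maxlen=0 consumes the iterator and discards the None
# results, so every q.put(flight) actually executes
collections.deque(map(q.put, flights), maxlen=0)

list(map(q.put, flights)) does the same but builds a throwaway list. Either way, I find the explicit loop easier to read.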
Upvotes: 2