Thisisstackoverflow

Reputation: 261

How to debug "pthread_cond_wait: Resource busy"?

I wrote a script that performs API calls using the Python multithreading library. It speeds up the processing by a huge margin because the bottleneck is the network, not anything on my host (cue someone stating Python doesn't do true multithreading here).

The issue is that sometimes when I run the script I receive this error, with my script eventually hanging/sleeping:

pthread_cond_wait: Resource busy

I have no idea how to figure out why this is happening. How do I get more context to debug the issue? Do I need to put print statements in a bunch of random places and hope to catch whatever issue is causing this? Is there a better way to debug?

If it helps, this is how I implemented the multithreading:

for i in range(threads): # make the threads
    t = threading.Thread(target=queue_worker, args=[apikey, q, retries, hit_threshold]) # The threads will use the "queue_worker" function with these parameters
    t.daemon = True
    t.start() # start the thread!
# Data is put onto the queue and queue_worker does the API work here...
...
q.join() # Clean up and close the threads when the threads are all idle (no more data on the queue)

EDIT:

The queue_worker, do_api and main code is basically this:

def queue_worker(apikey, q, retries, hit_threshold):
    while True: # each thread keeps pulling items off the queue
        api_data = q.get()
        for x in range(retries):
            try:
                response = do_api(api_data, apikey)
                break # success, stop retrying
            except Exception as error:
                time.sleep(5)
                continue
        else:
            # all retries failed
            error_count = error_count + 1
            q.task_done()
            continue
        # ... data parsing code here...
        # ... printing parsed data to screen here if a particular value returned is greater than "hit_threshold"...
        q.task_done()

def do_api(api_data, apikey):
    params = { 'apikey': apikey, 'resource': api_data }
    response = requests.get('https://MYURL.com/api', params=params, timeout=10)
    return response

if __name__ == '__main__':
    threads = 50
    q = Queue.Queue(threads)
    for i in range(threads): # make the threads
        t = threading.Thread(target=queue_worker, args=[apikey, q, retries, hit_threshold]) # The threads will use the "queue_worker" function with these parameters
        t.daemon = True
        t.start() # start the thread!
    # Data is put onto the queue and queue_worker does the API work here...
    ...
    q.join() # Clean up and close the threads when the threads are all idle (no more data on the queue)

Upvotes: 1

Views: 1909

Answers (1)

stovfl

Reputation: 15533

Comment: Any tips on debugging?

  1. Double-check any of your own Locks, Conditions, or other threading primitives for nested usage.
  2. Use your own Locks around access to shared variables.
  3. Read Python Threads and the Global Interpreter Lock and try the "work around" suggested there:

    There are other ways to accelerate the GIL manipulation or avoid it:

    • call time.sleep()
    • set sys.setcheckinterval()
    • run Python in optimized mode
    • dump process-intensive tasks into C-extensions
    • use the subprocess module to execute commands
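As a minimal sketch of tip 2: guard any variable shared between workers (such as the question's error_count) with an explicit Lock. The names here are illustrative, not from the question's code:

```python
import threading

error_count = 0
error_count_lock = threading.Lock()  # guards the shared counter

def record_errors(n):
    """Increment the shared counter n times, one locked step at a time."""
    global error_count
    for _ in range(n):
        with error_count_lock:  # acquire/release around every access
            error_count += 1

workers = [threading.Thread(target=record_errors, args=(1000,)) for _ in range(10)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(error_count)  # 10000 -- without the lock, updates could be lost
```

Without the lock, `error_count += 1` is a read-modify-write that two threads can interleave, silently losing increments.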

Most likely you are facing the Python GIL!

what-is-a-global-interpreter-lock-gil

Either one of the other threads is holding the lock,
or there is inconsistent use of locking.
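To get more context on where the script is actually hanging (rather than scattering print statements), one option — an addition, not from the linked material — is the stdlib faulthandler module, which can dump every thread's stack so you can see which call each worker is blocked in:

```python
import faulthandler
import tempfile

# If the process is still running after 60 seconds, dump all thread
# stacks to stderr (without killing the process):
faulthandler.dump_traceback_later(timeout=60, exit=False)

# You can also dump all thread stacks on demand. Here the dump goes
# into a temp file just to show the output; normally you'd use the
# default sys.stderr:
with tempfile.TemporaryFile(mode="w+") as f:
    faulthandler.dump_traceback(file=f, all_threads=True)
    f.seek(0)
    dump = f.read()

faulthandler.cancel_dump_traceback_later()  # disarm the timer again
print(dump.splitlines()[0])  # e.g. "Current thread 0x... (most recent call first):"
```

Each dumped stack ends at the exact line a thread is blocked on, which usually points straight at the lock or queue operation that never returns.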

Upvotes: 1
