Ivan

Reputation: 103

Determine Ruby thread state

I have a Ruby script fetching HTML pages over HTTP using threads:

require "thread"
require "net/http"

q = Queue.new
q << "http://google.com/"
q << "http://rubygems.org/"
q << "http://twitter.com/"
t = Thread.new do
  loop do
    html = Net::HTTP.get(URI(q.pop))
    p html.length
  end
end

10.times do
  puts t.status
  sleep 0.3
end

I'm trying to determine the state of the thread while it is fetching content from the given sources. Here is the output I got:

run
219
sleep
sleep
7255
sleep
sleep
sleep
sleep
sleep
sleep
65446
sleep

The thread reports the "sleep" state almost all the time even though it is actually working. I understand that it's waiting for Net::HTTP to retrieve the content. The last "sleep" is different: the thread tried to pop a value from the empty queue and went to sleep until something new is pushed onto it.

I want to be able to check what's going on in the thread: is it working on an HTTP request, or simply waiting for a new job to appear?

What is the right way to do it?

Upvotes: 3

Views: 785

Answers (1)

Wayne Conrad

Reputation: 107979

The sleep state appears to cover both I/O wait and being blocked on synchronization, so the thread state alone won't tell you whether the thread is working or waiting. Instead, the thread can publish its own status through thread-local storage: use Thread#[]= to store a value and Thread#[] to read it back. (Strictly speaking, these methods access fiber-local variables, which makes no practical difference here because the worker does not use fibers.)

require "thread"
require "net/http"

q = Queue.new
q << "http://google.com/"
q << "http://rubygems.org/"
q << "http://twitter.com/"
t = Thread.new do
  loop do
    Thread.current[:status] = 'waiting'
    request = q.pop
    Thread.current[:status] = 'fetching'
    html = Net::HTTP.get(URI(request))
    Thread.current[:status] = 'processing'
    # Busy-wait for half a second to simulate processing.
    start_time = Time.now
    nil while Time.now - start_time < 0.5
    p html.length
  end
end

10.times do
  puts t[:status]
  sleep 0.3
end

I've added a short busy loop to simulate processing time. Without it, printing the length finishes far faster than the 0.3-second polling interval, so it's unlikely you'd see "processing" in the output:

219
processing
fetching
processing
7255
fetching
fetching
fetching
62471
processing
waiting
waiting
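
To try the idea without depending on the network, here is a minimal sketch in which sleep stands in for the HTTP request. The helper name wait_for_status, the sample delays, and the observed array are illustrative assumptions, not part of the code above. It samples both Thread#status and the custom status label at two moments: while the worker is blocked in Queue#pop, and while it is "fetching".

```ruby
# Network-free sketch: `sleep` stands in for Net::HTTP.get (assumption).
jobs = Queue.new

worker = Thread.new do
  loop do
    Thread.current[:status] = "waiting"
    delay = jobs.pop               # blocks while the queue is empty
    Thread.current[:status] = "fetching"
    sleep delay                    # simulated I/O wait
  end
end

# Hypothetical helper: poll until the worker publishes the expected label
# or the timeout expires, then return whatever label is current.
def wait_for_status(thread, expected, timeout: 2.0)
  deadline = Time.now + timeout
  sleep 0.01 until thread[:status] == expected || Time.now > deadline
  thread[:status]
end

observed = []

# Sample 1: the worker is blocked in Queue#pop.
wait_for_status(worker, "waiting")
sleep 0.05                         # let the worker actually block in pop
observed << [worker.status, worker[:status]]

# Sample 2: the worker is inside the simulated request.
jobs << 0.5
wait_for_status(worker, "fetching")
sleep 0.05                         # let the worker actually enter sleep
observed << [worker.status, worker[:status]]

observed.each { |pair| p pair }
worker.kill
```

In both samples Thread#status should typically read "sleep", while the custom label distinguishes "waiting" from "fetching". The short pauses before sampling are a pragmatic way to let the worker reach the blocking call; they are not a guarantee of scheduling order.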

Upvotes: 4
