user337620

Reputation: 2409

Detect redirect with ruby mechanize

I am using the mechanize/nokogiri gems to parse some random pages. I am having problems with 301/302 redirects. Here is a snippet of the code:

agent = Mechanize.new
page = agent.get('http://example.com/page1')

The test server at example.com redirects page1 to page2 with a 301/302 status code, so I was expecting to get

page.code == "301"

Instead I always get page.code == "200".

My requirements are:

I know that I can see page1 in agent.history, but that's not reliable. I also want the redirect status code.

How can I achieve this behavior with mechanize?

Upvotes: 16

Views: 9870

Answers (2)

pguardiario

Reputation: 54984

You could turn redirects off and keep following the Location header yourself:

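require 'mechanize'

agent = Mechanize.new
agent.redirect_ok = false  # return the 3xx response instead of following it
page = agent.get 'http://www.google.com'
status_code = page.code    # status of the first response, e.g. "301"

# follow 301/302 responses by hand until a non-redirect page comes back
while page.code[/30[12]/]
  page = agent.get page.header['location']
end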
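
If you also want the status code of every hop, the same loop can be wrapped in a small helper. This is only a sketch, not something Mechanize ships: follow_redirects, max_hops and the list of redirect codes are names and choices I made up for illustration. It assumes the agent was created with redirect_ok = false, so that get returns the 3xx page itself, and that the page responds to code, uri and header as in the snippet above. It also resolves a relative Location header against the current page URL and stops after a fixed number of hops to avoid redirect loops.

```ruby
require 'uri'

# Status codes treated as redirects in this sketch (an assumption; trim as needed).
REDIRECT_CODES = %w[301 302 303 307 308].freeze

# Hypothetical helper: follows redirects by hand and records each hop.
# `agent` is assumed to behave like a Mechanize agent with redirect_ok = false.
# Returns [final_page, hops], where hops is a list of [url, status_code] pairs.
def follow_redirects(agent, url, max_hops: 10)
  hops = []
  page = agent.get(url)

  loop do
    hops << [page.uri.to_s, page.code]
    break unless REDIRECT_CODES.include?(page.code)

    location = page.header['location']
    break if location.nil?
    raise "too many redirects (more than #{max_hops})" if hops.size > max_hops

    # Location may be relative, so resolve it against the current page URL.
    page = agent.get(URI.join(page.uri.to_s, location).to_s)
  end

  [page, hops]
end

# Usage with Mechanize (not run here, needs the gem and network access):
#   agent = Mechanize.new
#   agent.redirect_ok = false
#   page, hops = follow_redirects(agent, 'http://example.com/page1')
#   hops.first.last  # status code of the first response
```

The first entry in hops holds the status code of the original request, which is what the question asks for, and the last entry describes the page you ended up on.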

Upvotes: 26

user337620

Reputation: 2409

I found a way to allow redirects and also get the status code, but I'm not sure it's the best method.

agent = Mechanize.new

# deactivate redirects first
agent.redirect_ok = false

status_code = '200'
error_occurred = false

# request the url (url is assumed to hold the address to check)
begin
  page = agent.get(url)
  status_code = page.code
rescue Mechanize::ResponseCodeError => ex
  status_code = ex.response_code
  error_occurred = true
end

if !error_occurred && status_code != '200'
  # enable redirects and request the page again
  agent.redirect_ok = true
  page = agent.get(url)
end

Upvotes: 4
