Reputation: 65
I'm using mechanize to scrape a website, which works nicely. However, you can't tell from a link what kind of file it points to (e.g. http://somesite.com/images.php?get=123). Is it possible to download only the headers?
I'm asking because I'd like to decide, based on the file type, whether to download it. It would also help in choosing a filename when downloading.
It doesn't have to use mechanize, but is there a Rails way of doing this?
Upvotes: 2
Views: 1106
Reputation: 35093
How about Net::HTTP#head? http://ruby-doc.org/stdlib/libdoc/net/http/rdoc/classes/Net/HTTP.html#M000682
response = nil
Net::HTTP.start('some.www.server', 80) {|http|
response = http.head('/index.html')
}
p response['content-type']
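The snippet above uses a fixed host and path. As a rough sketch of how it might be adapted to the question's use case (a full URL with a query string, then a download decision and a filename), something like the following could work. The helper names `fetch_headers` and `suggest_filename` and the `EXTENSIONS` table are hypothetical, made up for illustration; only `Net::HTTP`, `URI`, and `File.basename` come from the standard library. On Ruby 1.8 you would also need `require 'net/https'` for `use_ssl=`.

```ruby
require 'net/http'
require 'uri'

# Hypothetical lookup table (not part of any library); extend as needed.
EXTENSIONS = {
  'image/gif'       => '.gif',
  'image/jpeg'      => '.jpg',
  'image/png'       => '.png',
  'application/pdf' => '.pdf',
  'text/html'       => '.html'
}

# Hypothetical helper: send a HEAD request and return the response object.
# Only the headers are transferred, not the body.
def fetch_headers(url)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')
  # request_uri keeps the query string, e.g. "/images.php?get=123"
  http.start { |conn| conn.head(uri.request_uri) }
end

# Hypothetical helper: prefer the filename from Content-Disposition,
# otherwise combine the URL's basename with an extension guessed
# from the content type.
def suggest_filename(content_type, content_disposition, url)
  if content_disposition && content_disposition =~ /filename="?([^";]+)"?/i
    return File.basename($1.strip)
  end
  # Drop parameters such as "; charset=..." from the content type.
  media_type = content_type.to_s.split(';').first.to_s.strip.downcase
  base = File.basename(URI.parse(url).path, '.*')
  base = 'download' if base.empty? || base == '/'
  base + EXTENSIONS.fetch(media_type, '')
end
```

Usage would then look roughly like `res = fetch_headers(url)` followed by `suggest_filename(res['content-type'], res['content-disposition'], url)`, downloading only if the content type is one you want. Note that some servers answer HEAD requests differently from GET, so treat the result as a hint.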
Upvotes: 3
Reputation: 3500
You can use curb:
ruby-1.8.7-p174 > require 'rubygems'
=> true
ruby-1.8.7-p174 > require 'curb'
=> true
ruby-1.8.7-p174 > c = Curl::Easy.http_head('https://encrypted.google.com/images/logos/ssl_logo_lg.gif'){|easy| easy.follow_location = true}
=> #<Curl::Easy https://encrypted.google.com/images/logos/ssl_logo>
ruby-1.8.7-p174 > c.perform
=> true
ruby-1.8.7-p174 > c.content_type
=> "image/gif"
Upvotes: 1