capsized

Reputation: 65

Downloading only the header of a file

I'm using mechanize to scrape a website, which works nicely. However, you can't always tell from a link what kind of file it points to, e.g. http://somesite.com/images.php?get=123. Is it possible to download only the headers?

I'm asking because I'd like to decide, based on the file type, whether to download it. It would also help in choosing a filename when downloading.

It doesn't have to use mechanize, but is there a Rails way of doing this?

Upvotes: 2

Views: 1106

Answers (2)

Nakilon

Reputation: 35093

This? http://ruby-doc.org/stdlib/libdoc/net/http/rdoc/classes/Net/HTTP.html#M000682

response = nil
Net::HTTP.start('some.www.server', 80) {|http|
    response = http.head('/index.html')
}
p response['content-type']
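For a URL like the one in the question, the same HEAD approach might be wrapped up roughly as below. This is only a sketch: `head_for`, `media_type`, `filename_for` and the `EXTENSIONS` table are hypothetical names made up for illustration, the extension mapping is an assumption, and redirects and the `filename*=` form of Content-Disposition are not handled.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: send a HEAD request for a full URL and return the
# response. Redirects are NOT followed here.
def head_for(url)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.head(uri.request_uri) # request_uri keeps the query string, e.g. "?get=123"
  end
end

# Hypothetical helper: media type without parameters such as "; charset=utf-8".
def media_type(content_type)
  content_type.to_s.split(';').first.to_s.strip.downcase
end

# Assumed mapping from media type to file extension; extend as needed.
EXTENSIONS = {
  'image/gif'       => '.gif',
  'image/jpeg'      => '.jpg',
  'image/png'       => '.png',
  'application/pdf' => '.pdf'
}.freeze

# Hypothetical helper: prefer the filename from Content-Disposition if the
# server sent one, otherwise use a fallback base name plus an extension
# guessed from the media type (no extension if the type is unknown).
def filename_for(content_type, content_disposition, fallback_base)
  if content_disposition.to_s =~ /filename="?([^";]+)"?/i
    return File.basename($1.strip)
  end
  fallback_base + EXTENSIONS.fetch(media_type(content_type), '')
end
```

Usage would be something like `response = head_for('http://somesite.com/images.php?get=123')`, then checking `media_type(response['content-type'])` before deciding to download, and passing `response['content-type']` and `response['content-disposition']` to `filename_for` to choose a name. Keeping the header parsing separate from the network call makes it easy to try out without hitting the server.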

Upvotes: 3

hellvinz

Reputation: 3500

You can use curb:

ruby-1.8.7-p174 > require 'rubygems'
 => true 
ruby-1.8.7-p174 > require 'curb'
 => true  
ruby-1.8.7-p174 > c = Curl::Easy.http_head('https://encrypted.google.com/images/logos/ssl_logo_lg.gif'){|easy| easy.follow_location = true}
 => #<Curl::Easy https://encrypted.google.com/images/logos/ssl_logo>
ruby-1.8.7-p174 > c.perform
 => true
ruby-1.8.7-p174 > c.content_type
 => "image/gif" 

Upvotes: 1
