Reputation: 8588
Is there a way to check the size of a file when you only have a link to it?
I have extracted the path to an image (with mechanize) from a site and want to put a condition on it that turns true or false depending on the file size.
page = Mechanize.new.get('http://www.someurl.com/').parser
image = page.search('//img[@id="img1"]/@src').text
Now, what I want to do is check the file size of image.
For a local file I could do something like File.size to get its size in bytes. Is there any way to check the size of image?
Upvotes: 4
Views: 1979
Reputation: 27207
I think the Mechanize#head method will work:
image_size = Mechanize.new.head( image_url )["content-length"].to_i
HTTP HEAD requests are a lesser-known cousin of HTTP GET, where the server is expected to respond with the same headers as if performing the GET request, but does not include the body. It is often used in web caching.
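If you would rather not depend on Mechanize for this step, the same idea can be sketched with Ruby's standard library. This is only a rough sketch: the helper names content_length_from and remote_file_size are made up for illustration, and it assumes the server actually sends a Content-Length header in response to HEAD (some servers omit it, for example when they use chunked transfer encoding), in which case nil is returned.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: read Content-Length from anything that responds to
# [] with a header name (a Hash or a Net::HTTPResponse).
# Returns the size in bytes, or nil if the header is missing or not a number.
def content_length_from(headers)
  value = headers['content-length']
  return nil if value.nil? || value.to_s.strip.empty?
  Integer(value.to_s.strip, 10)
rescue ArgumentError
  nil
end

# Hypothetical helper: send a HEAD request and return the reported size.
# Redirects are not followed in this sketch.
def remote_file_size(url)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    response = http.head(uri.request_uri)
    content_length_from(response)
  end
end
```

The condition from the question then becomes something like size = remote_file_size(image_url) followed by a check such as size && size > 10_000, where the threshold is whatever limit you have in mind. The nil check matters because the size is unknown when the header is absent.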
Example taken from Mobile Phones/eBay (requested by Arup Rakshit)
start_url = 'http://www.ebay.in/sch/Mobile-Phones-/15032/i.html'
crawler = Mechanize.new
page = crawler.get( start_url ).parser
image_url = page.search('//img/@src').first.text
image_size = crawler.head( image_url )["content-length"].to_i
=> 4244
Upvotes: 6