Reputation: 629
I'm creating an API service that lets people pass the URL of an image in the API call, and the service downloads the image to process it.
How do I ensure somebody does NOT give me the URL of, like, a 5MB image? Is there a way to limit the request?
This is what I have so far, which basically grabs everything.
req = Net::HTTP::Get.new(url.path)
res = Net::HTTP.start(url.host, url.port) { |http|
  http.request(req)
}
Thanks, Conrad
Upvotes: 4
Views: 3332
Reputation: 19146
Combining the other two answers, I'd like to 1) check the size header, 2) watch the size of the chunks, 3) support https, and 4) aggressively enforce a timeout. Here's the helper I came up with:
require "net/http"
require 'uri'
module FetchUtil
# Fetch a URL, with a given max bytes, and a given timeout
def self.fetch_url url, timeout_sec=5, max_bytes=5*1024*1024
uri = URI.parse(url)
t0 = Time.now.to_f
body = ''
Net::HTTP.start(uri.host, uri.port,
:use_ssl => (uri.scheme == 'https'),
:open_timeout => timeout_sec,
:read_timeout => timeout_sec) { |http|
# First make a HEAD request and check the content-length
check_res = http.request_head(uri.path)
raise "File too big" if check_res['content-length'].to_i > max_bytes
# Then fetch in chunks and bail on either timeout or max_bytes
# (Note: timeout won't work unless bytes are streaming in...)
http.request_get(uri.path) do |res|
res.read_body do |chunk|
raise "Timeout error" if (Time.now().to_f-t0 > timeout_sec)
raise "Filesize exceeded" if (body.length+chunk.length > max_bytes)
body += chunk
end
end
}
return body
end
end
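A quick usage sketch (the URL here is just a placeholder, not from the question):

body = FetchUtil.fetch_url("https://example.com/image.png")
puts "fetched #{body.bytesize} bytes"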
Upvotes: 2
Reputation: 31
Another way to limit the download size (real code should also check the response status, handle exceptions, etc.; this is just an example):
Net::HTTP.start(uri.host, uri.port) do |http|
  request = Net::HTTP::Get.new uri.request_uri
  http.request request do |response|
    # check response codes here
    body = ''
    response.read_body do |chunk|
      body += chunk
      # Breaking out of read_body aborts the download mid-stream
      break if body.size > MY_SAFE_SIZE_LIMIT
    end
    break
  end
end
Upvotes: 2
Reputation: 33249
cwninja unfortunately gave you an answer that will only work for accidental attacks; an intelligent attacker will have no trouble at all defeating that check. There are two main reasons his method should not be used. First, nothing guarantees that the information in a HEAD response will match the corresponding GET response. A well-behaved server certainly will do this, but a malicious actor does not have to follow the spec: the attacker can simply send a HEAD response claiming a Content-Length below your threshold, then hand you a huge file in the GET response.
Second, that doesn't even cover the potential for a server to send back a response with the Transfer-Encoding: chunked header set. A chunked response could quite possibly never end. A few people pointing your server at never-ending responses could carry out a trivial resource-exhaustion attack, even if your HTTP client enforces a timeout.
To do this correctly, you need to use an HTTP library that allows you to count the bytes as they're received, and abort if it crosses the threshold. I would probably recommend Curb for this rather than Net::HTTP. (Can you even do this at all with Net::HTTP?) If you use the on_body and/or on_progress callbacks, you can count the incoming bytes and abort mid-response if you receive a file that's too large. Obviously, as cwninja already pointed out, if you receive a Content-Length header larger than your threshold, you want to abort for that too. Curb is also notably faster than Net::HTTP.
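As a rough illustration of the Curb approach (a minimal sketch, assuming a 5MB cap; the method name and error handling are mine, not a canonical API):

require 'curb'

# Sketch: count bytes in Curb's on_body callback and abort past the cap.
# libcurl treats a write callback that returns anything other than the
# chunk's size as an error, which Curb surfaces as Curl::Err::WriteError.
def fetch_with_cap(url, max_bytes = 5*1024*1024)
  body = ''
  curl = Curl::Easy.new(url)
  curl.on_body do |chunk|
    body << chunk
    body.bytesize > max_bytes ? 0 : chunk.bytesize
  end
  begin
    curl.perform
  rescue Curl::Err::WriteError
    raise "Filesize exceeded"
  end
  body
end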
Upvotes: 8
Reputation: 9778
Try running this first:
Net::HTTP.start(url.host, url.port) { |http|
  response = http.request_head(url.path)
  raise "File too big." if response['content-length'].to_i > 5*1024*1024
}
You still have a race condition (someone could swap out the file after you do the HEAD request), but in the simple case this asks the server for the headers it would send back from a GET request.
Upvotes: 2