Reputation: 29777
This URL takes you to an image, but has no file extension to check a regex against:
http://www.tonymooreillustration.com/gallery/main.php?g2_view=core.DownloadItem&g2_itemId=393
I'm using RestClient (an HTTP and REST client for Ruby) in my app, so I tried this:
RestClient.get "http://www.tonymooreillustration.com/gallery/main.php?g2_view=core.DownloadItem&g2_itemId=393"
I get back lots of text that begins like this:
"\377???JFIF\000\001\002\001\000H\000H\000\000\377?cExif\000\000MM\000*\000\000\000\b\000\a\001\022\000\003\000\000\000\001\000\001\000\000\001\032\000\005\000\000\000\001\000\000\000b\001\e\000\005\000\000\000\001\000\000\000j\001(\000\003\000\000\000\001\000\002\000\000\0011\000\002\000\000\000\024\000\000\000r\0012\000\002\000\000\000\024\000\000\000\206\207i\000\004\000\000\000\001\000\000\000\234\000\000\000?\000\000H\000\000\000\001\000\000\000H\000\000\000\001Adobe Photoshop 7.0\0002005:07:12 02:58:19\000\000\000\000\003\240\001\000\003\000\000\000\001\377\377\000\000\240\002\000\004\000\000\000\001\000\000\001?\000\004\000\000\000\001\000\000\002?\000\000\000\000\000\006\001\003\000\003\000\000\000
Is there a way I can use this to determine if the URL is pointing at an image?
Upvotes: 2
Views: 401
Reputation: 3916
Use FastImage. It fetches only the minimum data required from the URL to determine whether it's an image, what type of image it is, and its size.
Upvotes: 0
Reputation: 8785
Your best bet is the Content-Type header, but if all else fails you can derive the image format from the initial set of bytes. Search for "<format> file format", replacing <format> with each of the file formats you need to identify.
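As a rough sketch of the byte-sniffing fallback: the snippet below compares the start of a response body against the well-known leading signatures for JPEG, PNG, and GIF. The names `IMAGE_SIGNATURES` and `sniff_image_format` are made up for this example, and the table only covers these three formats, so extend it for whatever else you need to identify.

```ruby
# Leading "magic" bytes for a few common image formats.
# IMAGE_SIGNATURES and sniff_image_format are hypothetical names for this sketch.
IMAGE_SIGNATURES = {
  jpeg: ["\xFF\xD8\xFF".b],
  png:  ["\x89PNG\r\n\x1A\n".b],
  gif:  ['GIF87a'.b, 'GIF89a'.b]
}.freeze

# Returns :jpeg, :png, or :gif when the data starts with a known signature,
# and nil otherwise.
def sniff_image_format(data)
  bytes = data.to_s.b # compare as raw bytes, ignoring the string's encoding
  IMAGE_SIGNATURES.each do |format, signatures|
    return format if signatures.any? { |sig| bytes.start_with?(sig) }
  end
  nil
end
```

The body in the question begins with \377 and contains JFIF shortly afterwards, which is consistent with a JPEG, so passing that body to `sniff_image_format` would be expected to match the JPEG entry.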
Upvotes: 1
Reputation: 42267
I did this about 5 years ago in PHP. Sadly I don't have the code any more. Basically I used curl with an option to follow all redirects. That way the data being returned to the program was nearly always what I actually wanted to test. From that point, I would grab only the first few bytes of the content and check whether the image metadata existed and whether it was JPG, PNG, or GIF. Having the code to show you would probably explain this a lot better, but it's gone. I likened this to "tasting" a file before eating it.
The advantage of using this kind of technique is that you're actually checking the file instead of relying on header info, which is usually correct but not always.
Upvotes: 0
Reputation: 66263
It looks like the REST Client response wraps Ruby's Net::HTTPResponse, so if res is the result from RestClient.get you should be able to do:
res.net_http_res.header['content-type']
and see if this starts with image/, e.g. image/jpeg for a JPEG image.
If you don't actually need a copy of the image and just need to check what the URL points to, you are better off doing a HEAD request, as reto suggests. (This avoids receiving an unnecessary copy of the body content.)
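A minimal sketch of the "starts with image/" check, written as a plain function so it works on any header value. `image_content_type?` is a hypothetical helper name, not part of RestClient. It strips any parameters (such as a charset) and ignores case before comparing.

```ruby
# Hypothetical helper: true when a Content-Type header value names an image type.
def image_content_type?(header_value)
  # Drop parameters such as "; charset=binary" and normalise case and whitespace.
  media_type = header_value.to_s.split(';').first.to_s.strip.downcase
  media_type.start_with?('image/')
end
```

Assuming the accessor above behaves as described, you would call it as `image_content_type?(res.net_http_res.header['content-type'])`.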
Upvotes: 2
Reputation: 16732
You could do a HEAD request and check the header for MIME information.
See: http://ruby-doc.org/stdlib/libdoc/net/http/rdoc/classes/Net/HTTP.html#M000682
The response you get in your example is the image itself. You could also try to determine whether or not it is a picture by using a utility like file [1] or an image library like ImageMagick [2].
[1] http://unixhelp.ed.ac.uk/CGI/man-cgi?file
[2] http://rmagick.rubyforge.org/
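Here is roughly what the HEAD approach could look like with Ruby's standard Net::HTTP. The helper names `image_response?` and `url_points_to_image?` are invented for this sketch. It does not follow redirects or handle network errors, and it trusts whatever Content-Type the server reports.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: true when the response's Content-Type starts with "image/".
# Accepts anything that answers ['content-type'], such as a Net::HTTPResponse.
def image_response?(response)
  response['content-type'].to_s.strip.downcase.start_with?('image/')
end

# Hypothetical helper: send a HEAD request so only the headers are transferred,
# then inspect the Content-Type. Redirects and errors are not handled here.
def url_points_to_image?(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.head(uri.request_uri)
  end
  image_response?(response)
end
```

Calling `url_points_to_image?` with the URL from the question would issue a single HEAD request; what it returns depends on the header that server actually sends.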
Upvotes: 2