TNT

Reputation: 3620

Ruby: URI::InvalidURIError (URI must be ascii only

require 'uri'
uri = URI.parse 'http://dxczjjuegupb.cloudfront.net/wp-content/uploads/2017/10/Оуэн-Мэтьюс.jpg'

Browsers have no problem with http://dxczjjuegupb.cloudfront.net/wp-content/uploads/2017/10/Оуэн-Мэтьюс.jpg, so I'm wondering whether this Ruby class is a bit outdated. Should I abandon it completely or do some error handling…

Upvotes: 44

Views: 23287

Answers (8)

Siva Praveen

Reputation: 2333

https://bibwild.wordpress.com/2023/02/14/escaping-encoding-uri-components-in-ruby-3-2/

TL;DR

Ruby >= 3.2

require 'cgi'
 
url = "https://example.com/some/#{ CGI.escapeURIComponent path_component }" + 
  "?#{CGI.escapeURIComponent my_key}=#{CGI.escapeURIComponent my_value}"

Ruby < 3.2

require 'cgi'
CGI.escape(input).gsub("+", "%20")

or

require 'erb'
ERB::Util.url_encode(input)
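
As a rough sketch of how the pre-3.2 variant could be applied to a URL like the one in the question: encode each path segment on its own so the `/` separators survive, then parse the result. `encode_path` is a hypothetical helper name, and `example.com` is only a placeholder host.

```ruby
require 'erb'
require 'uri'

# Hypothetical helper: percent-encode each path segment separately,
# so the "/" separators between segments are left untouched.
def encode_path(path)
  path.split('/', -1).map { |segment| ERB::Util.url_encode(segment) }.join('/')
end

raw_path = '/wp-content/uploads/2017/10/Оуэн-Мэтьюс.jpg'
uri = URI.parse('http://example.com' + encode_path(raw_path))
puts uri.path
```

Encoding the whole URL in one call would also escape `:` and `/`, which is why this sketch only touches the path segments.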

Upvotes: -1

noraj

Reputation: 4622

TL;DR

Ruby < 3.0 (not recommended)

uri = URI.parse(URI.escape(url))

Ruby > 2.0 (recommended)

uri = URI.parse(URI::Parser.new.escape(url))

Explanation

URI.escape / URI.encode were removed in Ruby 3.0. This solution uses Ruby's own uri module rather than relying on a third-party gem.
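
A minimal sketch of the parser-based approach, wrapped in a helper. `parse_unicode_url` is a made-up name, and the sketch assumes `URI::RFC2396_Parser` (the class that `URI::Parser` has historically pointed to) is available in your Ruby version.

```ruby
require 'uri'

# Hypothetical helper: escape the raw URL with the RFC 2396 parser,
# then hand the ASCII-only result to URI.parse.
def parse_unicode_url(url)
  parser = URI::RFC2396_Parser.new
  URI.parse(parser.escape(url))
end

uri = parse_unicode_url('http://example.com/Оуэн-Мэтьюс.jpg')
puts uri.host
puts uri.path
```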

Upvotes: 14

Guss

Reputation: 32315

With kudos to all the URI.escape answers (URI.escape is also known as URI.encode): these methods were officially made obsolete in Ruby 2.7, i.e. they now produce a visible "URI.escape is obsolete" warning when you use them, whereas previously they were merely deprecated. In Ruby 3.0 these methods were removed completely and are no longer available at all, not even with a warning.

Unfortunately, as far as I can tell, Ruby's standard library URI class does not offer any alternative for handling URIs containing non-ASCII characters, which are so common these days - <sarcasm>now that the web has gone international</sarcasm>.

The best solution I came up with is to use the addressable gem, which contains the URI class we deserve - it handles everything the world has to throw at it, and you can get an "HTTP safe" URI using the #display_uri method:

Addressable::URI.parse("http://example.com/Оуэн-Мэтьюс.jpg")
=> #<Addressable::URI:0xc8 URI:http://example.com/Оуэн-Мэтьюс.jpg>
Addressable::URI.parse("http://example.com/Оуэн-Мэтьюс.jpg").display_uri.to_s
=> "http://example.com/%D0%9E%D1%83%D1%8D%D0%BD-%D0%9C%D1%8D%D1%82%D1%8C%D1%8E%D1%81.jpg"

Addressable::URI also comes with all kinds of goodies, such as port inference (you can tell whether the URL originally contained a port specification, or you can ignore that) and URL canonicalization (given a base URL, take a possibly relative URL and generate an absolute URL).

Here's how to use this with net/http:

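require 'net/http'
require 'addressable/uri'

url = Addressable::URI.parse("http://example.com/Оуэн-Мэтьюс.jpg")
response = Net::HTTP.start(url.host, url.inferred_port,
        :use_ssl => url.scheme == 'https') do |http|
    http.request(Net::HTTP::Get.new(url.display_uri.request_uri)) # actually send the request
end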

Upvotes: 20

Sun Soul

Reputation: 33

URI.encode('your-url')

This worked for me

Upvotes: -1

David Morales

Reputation: 18064

You can map the URL characters and escape the ones that are not ASCII. Something like this:

require 'cgi'
url.chars.map { |char| char.ascii_only? ? char : CGI.escape(char) }.join
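
The same idea can be sketched without the cgi library by percent-encoding the bytes of each non-ASCII character directly. `escape_non_ascii` is a hypothetical helper name and the URL is a placeholder. Because only non-ASCII characters are touched, sequences that are already encoded (such as `%20`) and the `#fragment` are left alone.

```ruby
require 'uri'

# Hypothetical helper: replace every non-ASCII character with the
# percent-encoded form of its UTF-8 bytes; ASCII characters pass through.
def escape_non_ascii(url)
  url.gsub(/[^[:ascii:]]/) do |char|
    char.bytes.map { |byte| format('%%%02X', byte) }.join
  end
end

escaped = escape_non_ascii('http://example.com/a%20b/Оуэн.jpg#top')
uri = URI.parse(escaped)
puts uri.path
puts uri.fragment
```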

Upvotes: 4

roee

Reputation: 908

What do you think about:

url = URI.escape(url) unless url.ascii_only?
URI.parse(url)

Upvotes: 9

Jose Paez

Reputation: 847

I had the same error:

Ruby: URI::InvalidURIError (URI must be ascii only

with my code, but the cause was that it was an old project and i18n was outdated. It was solved with a simple:

bundle update

Upvotes: 1

TNT

Reputation: 3620

The answer just came to me by asking myself the question:

begin
  uri = URI.parse(url)
rescue URI::InvalidURIError
  uri = URI.parse(URI.escape(url))
end
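
Since URI.escape is no longer available from Ruby 3.0 on, here is a hedged sketch of the same parse-first, escape-on-failure pattern using the RFC 2396 parser's escape method instead. `safe_parse` is a hypothetical name, and the sketch assumes `URI::RFC2396_Parser` exists in your Ruby version.

```ruby
require 'uri'

# Hypothetical helper: try to parse the URL as-is, and only escape it
# when URI.parse rejects it (for example because it is not ASCII-only).
def safe_parse(url)
  URI.parse(url)
rescue URI::InvalidURIError
  URI.parse(URI::RFC2396_Parser.new.escape(url))
end

puts safe_parse('http://example.com/a%20b.jpg').path
puts safe_parse('http://example.com/Оуэн-Мэтьюс.jpg').path
```

Parsing first matters because this escape method also escapes `%`, so running it on a URL that is already valid would double-encode sequences like `%20`.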

Upvotes: 49
