John Bachir
Reputation: 22711

Why doesn't URI.escape escape single quotes?

URI.escape("foo'bar\" baz")
=> "foo'bar%22%20baz"

Upvotes: 12

Views: 5189

Answers (4)

Max Williams
Reputation: 32933

I know this has been answered, but I wanted something slightly different and thought I might as well post it: I wanted to keep the "/" characters in the URL but escape every other non-standard character. I did it like this:

require "erb"

# public_filename is a *nix file path, such as
# "/images/isn't/this a /horrible filepath/hello.png"

# Encode each path segment separately so the "/" separators survive.
public_filename.split("/").collect { |s| ERB::Util.url_encode(s) }.join("/")
=> "/images/isn%27t/this%20a%20/horrible%20filepath/hello.png"

I needed to escape the single quote because I was writing a cache invalidation for AWS CloudFront, which didn't like single quotes and expected them to be escaped. The above should produce a URI that is safer than what the standard URI.escape gives but that still looks like a URI (CGI.escape breaks the URI format by escaping "/").

Upvotes: 0

King'ori Maina
Reputation: 4507

According to the docs, URI.escape(str [, unsafe]) takes an optional unsafe argument: a regexp that matches all symbols that must be replaced with codes. By default the method uses REGEXP::UNSAFE. When the unsafe argument is a String, it represents a character set.

In your case, to make URI.escape escape single quotes as well, you can do something like this:

reserved_characters = /[^a-zA-Z0-9\-\.\_\~]/
URI.escape(YOUR_STRING, reserved_characters)

Explanation: Some info on the spec ...

All parameter names and values are escaped using the [RFC3986] percent-encoding (%xx) mechanism. Characters not in the unreserved character set ([RFC3986] section 2.3) must be encoded. Characters in the unreserved character set must not be encoded. Hexadecimal characters in encodings must be upper case. Text names and values must be encoded as UTF-8 octets before percent-encoding them per [RFC3629].
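
The rule quoted above can be sketched directly in Ruby. This is only an illustration of the rule, not how URI.escape is implemented; the method name percent_encode and the constant NOT_UNRESERVED are hypothetical, and the input is assumed to already be UTF-8.

```ruby
# Hypothetical helper (not part of any library): percent-encode every byte
# outside the RFC 3986 unreserved set (letters, digits, "-", ".", "_", "~").
NOT_UNRESERVED = /[^a-zA-Z0-9\-._~]/

def percent_encode(value)
  # .b returns a binary copy, so multibyte UTF-8 characters are
  # encoded one octet at a time.
  encoded = value.to_s.b.gsub(NOT_UNRESERVED) do |byte|
    format("%%%02X", byte.ord) # %02X gives upper-case hex digits
  end
  encoded.force_encoding(Encoding::US_ASCII)
end

puts percent_encode("foo'bar\" baz")
```

With the string from the question this should give foo%27bar%22%20baz, with the single quote encoded as %27.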

Upvotes: 1

Foo L
Reputation: 11137

This is an old question, but the answer hasn't been updated in a long time. I thought I'd update this for others who are having the same problem. The solution I found was posted here: use ERB::Util.url_encode if you have the erb module available. This took care of single quotes & * for me as well.

CGI::escape doesn't encode spaces as %20; it turns them into plus signs (+) instead.
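
A quick way to see the difference for yourself is to run both methods on the same string. This is just a sketch; the sample string and variable names are made up for illustration.

```ruby
require "erb"
require "cgi"

# Made-up sample containing a single quote, a space and an asterisk.
sample = "foo'bar baz*"

erb_encoded = ERB::Util.url_encode(sample) # space becomes %20
cgi_encoded = CGI.escape(sample)           # space becomes +

puts erb_encoded
puts cgi_encoded
```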

Upvotes: 4

molf
Reputation: 74935

For the same reason it doesn't escape ? or / or :, and so forth. URI.escape() only escapes characters that cannot be used in URLs at all, not characters that have a special meaning.

What you're looking for is CGI.escape():

require "cgi"
CGI.escape("foo'bar\" baz")
=> "foo%27bar%22+baz"
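
Because CGI.escape also encodes characters that have a special meaning, it is probably best applied to a single component (such as a query value) rather than to a whole URL. A small sketch, where the example URL and the variable names are assumptions made up for illustration:

```ruby
require "cgi"

# Hypothetical values, for illustration only.
base  = "http://example.com/search"
value = "foo'bar\" baz?/:"

# Escape only the component, then assemble the URL around it.
url = "#{base}?q=#{CGI.escape(value)}"
puts url

# Escaping the whole URL would also encode ":", "/" and "?".
puts CGI.escape(base)

# CGI.unescape reverses the encoding of the component.
puts CGI.unescape(CGI.escape(value)) == value
```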

Upvotes: 11
