Does a `+` in a URL scheme/host/path represent a space?

I am aware that a + in the query string of a URL represents a space. Is this also the case outside of the query string region? That is to say, does the following URL:

http://a.com/a+b/c

actually represent:

http://a.com/a b/c

(and thus need to be encoded if it should actually be a +), or does it in fact actually represent a+b/c?

Upvotes: 236

Views: 253147

Answers (6)

Maxim Masiutin
Maxim Masiutin

Reputation: 4782

Space characters may only be encoded as "+" in one context: application/x-www-form-urlencoded key-value pairs.

The RFC-1866 (HTML 2.0 specification), paragraph 8.2.1, subparagraph 1 says: "The form field names and values are escaped: space characters are replaced by "+", and then reserved characters are escaped").

Here is an example of such a string in URL where RFC-1866 allows encoding spaces as pluses: "http://example.com/over/there?name=foo+bar". So, only after "?", can spaces be replaced by pluses (in other cases, spaces should be encoded to "%20"). This way of encoding form data is also given in later HTML specifications, for example, look for relevant paragraphs about application/x-www-form-urlencoded in HTML 4.01 Specification, and so on.

But, because it's hard to always correctly determine the context, it's the best practice to never encode spaces as "+". It's better to percent-encode all characters except "unreserved" defined in RFC-3986, p.2.3. Here is a code example that illustrates what should be encoded. It is given in Delphi (pascal) programming language, but it is very easy to understand how it works for any programmer regardless of the language possessed:

(* percent-encode all unreserved characters as defined in RFC-3986, p.2.3 *)
function UrlEncodeRfcA(const S: AnsiString): AnsiString;
const    
  HexCharArrA: array [0..15] of AnsiChar = '0123456789ABCDEF';
var
  I: Integer;
  c: AnsiChar;
begin
 // percent-encoding, see RFC-3986, p. 2.1
  Result := S;
  for I := Length(S) downto 1 do
  begin
    c := S[I];
    case c of
      'A' .. 'Z', 'a' .. 'z', // alpha
      '0' .. '9',             // digit
      '-', '.', '_', '~':;    // rest of unreserved characters as defined in the RFC-3986, p.2.3
      else
        begin
          Result[I] := '%';
          Insert('00', Result, I + 1);
          Result[I + 1] := HexCharArrA[(Byte(C) shr 4) and $F)];
          Result[I + 2] := HexCharArrA[Byte(C) and $F];
        end;
    end;
  end;
end;

function UrlEncodeRfcW(const S: UnicodeString): AnsiString;
begin
  Result := UrlEncodeRfcA(Utf8Encode(S));
end;

Upvotes: 28

Stobor
Stobor

Reputation: 45122

  • Percent encoding in the path section of a URL is expected to be decoded, but
  • any + characters in the path component is expected to be treated literally.

To be explicit: + is only a special character in the query component.

https://www.rfc-editor.org/rfc/rfc3986

Upvotes: 182

Niels R.
Niels R.

Reputation: 7364

You can find a nice list of corresponding URL encoded characters on W3Schools.

  • + becomes %2B
  • space becomes %20

Upvotes: 238

Baryon Lee
Baryon Lee

Reputation: 1177

use encodeURIComponent function to fix url, it works on Browser and node.js

res.redirect("/signin?email="+encodeURIComponent("[email protected]"));


> encodeURIComponent("http://a.com/a+b/c")
'http%3A%2F%2Fa.com%2Fa%2Bb%2Fc'

Upvotes: 0

The Java Guy
The Java Guy

Reputation: 2291

Try below:

<script type="text/javascript">

function resetPassword() {
   url: "submitForgotPassword.html?email="+fixEscape(Stringwith+char);
}
function fixEscape(str)
{
    return escape(str).replace( "+", "%2B" );
}
</script>

Upvotes: -4

Lennart Koopmann
Lennart Koopmann

Reputation: 21059

Thou shalt always encode URLs.

Here is how Ruby encodes your URL:

irb(main):008:0> CGI.escape "a.com/a+b"
=> "a.com%2Fa%2Bb"

Upvotes: -6

Related Questions