Bourne

Reputation: 1927

URL encoding of the space character in HTTP requests

Why is the space character URL encoded to %20? I don't see a reason why space is considered to be a reserved character.

Upvotes: 1

Views: 1746

Answers (3)

CodeCaster

Reputation: 151720

Because the Request-Line of an HTTP request is defined as:

Method (Space) Request-URI (Space) HTTP-Version CRLF

Naive HTTP servers that strictly adhere to the spec will do something like this:

splitInput = requestLine.Split(' ')

method = splitInput[0]
requestUri = splitInput[1]
httpVersion = splitInput[2]

That breaks as soon as a URL is allowed to contain a space, because the Request-URI itself would then be split into more than one piece.
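The naive split above can be sketched in Python to see the failure concretely. The function name `naive_parse` and the sample request lines are assumptions made for illustration; this is not how any particular server is implemented.

```python
def naive_parse(request_line):
    """Split a Request-Line on single spaces, as a naive server might."""
    split_input = request_line.rstrip("\r\n").split(" ")
    method = split_input[0]
    request_uri = split_input[1]
    http_version = split_input[2]
    return method, request_uri, http_version


# Encoded space: the three fields come out as intended.
good = naive_parse("GET /my%20page.html HTTP/1.1\r\n")
print(good)  # ('GET', '/my%20page.html', 'HTTP/1.1')

# Raw space: the URI is cut short and "page.html" lands in the version slot.
bad = naive_parse("GET /my page.html HTTP/1.1\r\n")
print(bad)  # ('GET', '/my', 'page.html')
```

With the raw space the line splits into four pieces, so the third field is no longer the HTTP version.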

Upvotes: 2

pataluc

Reputation: 589

Because the space is used as a separator in many contexts (program arguments, HTTP commands, etc.), it often has to be escaped: with a \ on a Unix command line, with surrounding " on a Windows command line, with %20 in URLs, and so on.

In the HTTP protocol, when you try to reach http://www.foo.com, your browser opens a connection to the server www.foo.com on port 80 and sends the commands:

GET http://www.foo.com HTTP/1.0
Accept: text/html

The syntax is "METHOD URL HTTPVERSION", with a single space between the fields.

If you tried to request http://www.foo.com/my page.html instead of http://www.foo.com/my%20page.html, the server would think "page.html" is the HTTP version you're asking for...
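On the client side, the encoding step can be sketched with Python's standard `urllib.parse` module. The path `/my page.html` comes from the example above; the variable names are illustrative assumptions.

```python
from urllib.parse import quote, unquote

raw_path = "/my page.html"

# quote() percent-encodes the space; "/" is left alone by default.
encoded_path = quote(raw_path)
print(encoded_path)  # /my%20page.html

# The request line now contains exactly two separating spaces.
request_line = f"GET {encoded_path} HTTP/1.0"
print(request_line.split(" "))  # ['GET', '/my%20page.html', 'HTTP/1.0']

# The server decodes the path after it has split the line.
print(unquote(encoded_path))  # /my page.html
```

The order matters: the receiver splits the line first and decodes afterwards, so the decoded space can never be mistaken for a field separator.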

Upvotes: 5

tckmn

Reputation: 59343

See RFC 3986 Section 2.3:

2.3.  Unreserved Characters

   Characters that are allowed in a URI but do not have a reserved
   purpose are called unreserved.  These include uppercase and lowercase
   letters, decimal digits, hyphen, period, underscore, and tilde.

      unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
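As a rough illustration of that rule, the sketch below builds the unreserved set from the ABNF and percent-encodes everything outside it. The helper name `percent_encode` is hypothetical, and the function is deliberately stricter than real URL encoders, which also leave some reserved characters such as "/" untouched depending on the URI component.

```python
import string

# unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def percent_encode(text):
    """Percent-encode every character that is not in the unreserved set."""
    out = []
    for ch in text:
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # Encode each UTF-8 byte of the character as %XX.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


print(" " in UNRESERVED)               # False
print(percent_encode("my page.html"))  # my%20page.html
```

The space is simply absent from the unreserved set, so it is written as the hex value of its byte, 0x20.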

Upvotes: 2
