Reputation: 1927
Why is the space character URL encoded to %20? I don't see a reason why space is considered to be a reserved character.
Upvotes: 1
Views: 1746
Reputation: 151720
Because the Request-Line of an HTTP request is defined as:
Method (Space) Request-URI (Space) HTTP-Version CRLF
Naive HTTP servers that strictly adhere to the spec will do something like this:
splitInput = requestLine.Split(' ')
method = splitInput[0]
requestUri = splitInput[1]
httpVersion = splitInput[2]
That breaks if spaces are allowed in a URL, because the split then yields more than three parts.
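To make the failure concrete, here is a minimal Python sketch of that kind of naive parser. The function name `parse_request_line` and the sample request lines are made up for illustration; real servers are more defensive than this.

```python
def parse_request_line(request_line: str) -> tuple[str, str, str]:
    """Naively split a request line on single spaces (hypothetical helper)."""
    parts = request_line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        # A raw space inside the URI produces extra fields.
        raise ValueError(f"expected 3 fields, got {len(parts)}: {parts!r}")
    method, request_uri, http_version = parts
    return method, request_uri, http_version


# Encoded space: exactly three fields, parses cleanly.
ok = parse_request_line("GET /my%20page.html HTTP/1.1\r\n")

# Raw space: four fields, so the naive parser rejects the line.
broken = None
try:
    parse_request_line("GET /my page.html HTTP/1.1\r\n")
except ValueError as exc:
    broken = str(exc)
```

A parser that silently took the first three fields instead of raising would read `page.html` as the HTTP version.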
Upvotes: 2
Reputation: 589
Because space is used as a separator in many contexts (program arguments, HTTP commands, etc.), it often has to be escaped: with a \ on a Unix command line, with surrounding " quotes on a Windows command line, with %20 in URLs, and so on.
In the HTTP protocol, when you try to reach http://www.foo.com, your browser opens a connection to the server www.foo.com on port 80 and sends the commands:
GET http://www.foo.com HTTP/1.0
Accept: text/html
The syntax is "METHOD URL HTTPVERSION"
If you tried to request http://www.foo.com/my page.html instead of http://www.foo.com/my%20page.html, the server would think "page.html" is the HTTP version you're looking for...
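As a rough illustration, Python's standard `urllib.parse.quote` performs this escaping. The path and the request line below are just the example from this answer, not anything a particular browser is guaranteed to send.

```python
from urllib.parse import quote, unquote

path = "/my page.html"

# quote() percent-encodes the space; "/" is left alone by default (safe="/").
encoded = quote(path)

# With the space escaped, the request line has exactly three fields again.
request_line = f"GET {encoded} HTTP/1.0"
fields = request_line.split(" ")

# The server can recover the original path by decoding.
decoded = unquote(encoded)
```

Here `encoded` is `/my%20page.html`, so splitting the request line on spaces gives the method, the URL, and the HTTP version as intended.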
Upvotes: 5
Reputation: 59343
See RFC 3986 Section 2.3:
2.3. Unreserved Characters
Characters that are allowed in a URI but do not have a reserved
purpose are called unreserved. These include uppercase and lowercase
letters, decimal digits, hyphen, period, underscore, and tilde.
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
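Space is not in that set, so it cannot appear literally and has to be percent-encoded. A small sketch of the rule in Python, assuming UTF-8 for the byte encoding; the names `UNRESERVED`, `is_unreserved` and `percent_encode` are hypothetical helpers, not part of the RFC or of any library:

```python
import string

# unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"  (RFC 3986, section 2.3)
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def is_unreserved(ch: str) -> bool:
    """Return True if ch is one of the RFC 3986 unreserved characters."""
    return ch in UNRESERVED


def percent_encode(text: str) -> str:
    """Percent-encode every character that is not unreserved.

    This also encodes reserved characters such as "/", so it only suits
    a single URI component, not a whole URI.
    """
    out = []
    for ch in text:
        if is_unreserved(ch):
            out.append(ch)
        else:
            # Assumption: characters are encoded as UTF-8 bytes.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


space_encoded = percent_encode("my page.html")
```

The space has byte value 0x20, which is where `%20` comes from.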
Upvotes: 2