Reputation: 47968
What characters (if any) can a web browser URL (http/https) not end with?
As far as I can tell, control characters aren't used, e.g.
\0 (nil)
\t (tab)
\n (newline)
Is there a complete list of such characters?
Upvotes: 0
Views: 751
Reputation: 96507
There are three ways a URI can end:
with the path component (if it has no query/fragment)
http://example.com/
http://example.com/path
http://example.com/path/path
with the query component (if it has no fragment)
http://example.com/?query
http://example.com/path?query
http://example.com/path/path?query
with the fragment component
http://example.com/#fragment
http://example.com/path#fragment
http://example.com/path/path#fragment
http://example.com/?query#fragment
http://example.com/path?query#fragment
http://example.com/path/path?query#fragment
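To make the three cases concrete, here is a minimal Python sketch that reports which component a URI ends with. The helper name `final_component` is made up for illustration; it only relies on the rule that the first `#` starts the fragment and a `?` before it starts the query.

```python
def final_component(uri: str) -> str:
    """Return which component a URI ends with (hypothetical helper)."""
    # Everything after the first "#" is the fragment.
    before_fragment, hash_sign, _fragment = uri.partition("#")
    if hash_sign:
        return "fragment"
    # Without a fragment, a "?" means the URI ends with the query.
    if "?" in before_fragment:
        return "query"
    # Otherwise the URI ends with the path.
    return "path"


for example in (
    "http://example.com/path/path",
    "http://example.com/path?query",
    "http://example.com/path?query#fragment",
):
    print(example, "ends with its", final_component(example))
```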
The URI standard (RFC 3986) doesn’t place any restrictions on the end of these three components (path, query, fragment), so any character that is allowed elsewhere in a component is also allowed at its end.
Regarding “space (from testing seems this is stripped)”:
URIs can have (multiple) space characters at the end (in all three cases), but they have to be percent-encoded. Spaces aren’t allowed unencoded, no matter where.
http://example.com/path-ending-with-four-spaces-%20%20%20%20
If a user agent tries to convert user input into a valid URI (i.e., percent-encoding all characters that can’t appear in the component), it might assume that trailing spaces aren’t intended to be part of the URI, and strip them.
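As a rough illustration of that behaviour, the sketch below trims surrounding whitespace from user input and then percent-encodes whatever is still not allowed. It is not how any particular browser does it; `input_to_uri` and `KEEP` are hypothetical names, and the only library call used is Python's `urllib.parse.quote`.

```python
from urllib.parse import quote

# Characters left untouched: the RFC 3986 reserved characters plus "%",
# so existing percent-encodings are not encoded a second time.
KEEP = ":/?#[]@!$&'()*+,;=%"


def input_to_uri(user_input: str) -> str:
    """Hypothetical helper: turn typed input into a syntactically valid URI."""
    # Assume leading/trailing whitespace was not intended by the user.
    trimmed = user_input.strip(" \t\r\n")
    # Percent-encode anything else that may not appear unencoded.
    return quote(trimmed, safe=KEEP)


print(input_to_uri("http://example.com/path   "))
print(input_to_uri("http://example.com/a b"))
```

Trailing spaces that were already percent-encoded in the input survive, because `%` is left alone and `strip` only removes literal whitespace.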
The same goes for tab and newline characters. They can be part of URIs if percent-encoded.
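A small Python sketch of the percent-encoded forms, using `urllib.parse.quote` and `unquote`. The base URL and the variable names are placeholders chosen for this example.

```python
from urllib.parse import quote, unquote

base = "http://example.com/path"
trailing = {"space": " ", "tab": "\t", "newline": "\n"}

# Build one URI per character, with that character percent-encoded at the end.
encoded = {name: base + quote(char) for name, char in trailing.items()}

for name, uri in encoded.items():
    # Decoding gives back the original trailing character.
    assert unquote(uri) == base + trailing[name]
    print(f"{name}: {uri}")
```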
Upvotes: 2