Reputation: 1269
I am writing an HTTP client that needs to send HTTP GET requests to fetch data. I am using the Boost.Asio library, so I don't have a standard URL-encoding library available.
Here is what I got from netcat and Mozilla (a typical GET request):
localhost:2000/questions/10838702/how-to-encode or-d   ecode-url-in-objective-c
Get Request Url
F:\pydev>nc -l -p 2000
GET /questions/10838702/how-to-encode%20or-d%20%20%20ecode-url-in-objective-c HTTP/1.1
Host: localhost:2000
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:11.0) Gecko/20100101 Firefox/11.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
I found that Mozilla encodes only the path part of the URL, not the host and port.
I also tried the URL encoder at http://meyerweb.com/eric/tools/dencoder/
It encodes the following URL
localhost:2000/questions/10838702/how-to-encode or-d   ecode-url-in-objective-c
to
localhost%3A2000%2Fquestions%2F10838702%2Fhow-to-encode%20or-d%20%20%20ecode-url-in-objective-c
Can anyone suggest where I should apply URL encoding?
Upvotes: 1
Views: 240
Reputation: 11317
As a general rule, any character other than alphanumerics (A-Z, a-z, 0-9), - _ . and ~ either has some special purpose in a URL or is not allowed.
Reserved characters are ; / ? : @ & and =. A space is not reserved, but it is unsafe and must always be encoded. If you use any of the reserved characters in a way other than their special meaning, then you must URL-encode them. To be safe, a lot of encoders just encode everything that isn't explicitly safe.
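As a rough illustration of the "encode everything that isn't explicitly safe" approach, here is a minimal C++ sketch that needs only the standard library. The function name percent_encode is hypothetical, and the sketch assumes the input is a plain byte string (for example UTF-8) that should be encoded byte by byte.

```cpp
#include <string>

// Hypothetical helper: percent-encode every byte that is not one of the
// explicitly safe characters (A-Z, a-z, 0-9, '-', '_', '.', '~').
std::string percent_encode(const std::string& input) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(input.size());
    for (unsigned char c : input) {
        const bool safe =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '_' || c == '.' || c == '~';
        if (safe) {
            out += static_cast<char>(c);
        } else {
            // Emit '%' followed by the two hex digits of the byte value.
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}
```

Because this encodes delimiters too, it is only suitable for a single component (a file name, a key, a value), not for a whole URL.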
For example, say you have a file with a question mark in its name (call it file?name) and you need to create a URL for it. The problem is that http://somehost.com/file?name will not be interpreted the way you want. The URL will match /file in your web space, with a search term of name. You have to encode the file name to get the URL http://somehost.com/file%3Fname.
The spec allows you to URL-encode any character, even alphanumerics, with the expectation that the server will decode them. You just have to make sure that wherever reserved characters are used for their intended purpose, they are not encoded. For example, you don't want to encode the colon or slashes in http://somehost.com because they are being used as delimiters.
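For the client in the question, one way to apply this is to encode only the path and leave the / delimiters and the Host header untouched. The sketch below is an assumption about how such a client might be structured, not a definitive implementation; encode_path and build_get_request are hypothetical names. It assumes every / in the path is a real delimiter, so a segment that contains a literal slash would have to be encoded separately before joining.

```cpp
#include <string>

// Hypothetical helper: percent-encode a URL path, leaving '/' intact
// because it is being used as the segment delimiter.
std::string encode_path(const std::string& path) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : path) {
        const bool keep =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '_' || c == '.' || c == '~' || c == '/';
        if (keep) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical helper: build the text of a GET request. Only the path is
// encoded; the host (with its ':' before the port) goes into the Host
// header unchanged.
std::string build_get_request(const std::string& host,
                              const std::string& path) {
    return "GET " + encode_path(path) + " HTTP/1.1\r\n"
           "Host: " + host + "\r\n"
           "Connection: close\r\n\r\n";
}
```

With host localhost:2000 and the path from the question, this yields the same request line that Firefox sent to netcat. The resulting string could then be sent over a connected socket, for instance with boost::asio::write and boost::asio::buffer.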
The most frequent use of URL encoding is to prepare form data. In this case you usually start with a set of key-value pairs. To construct the encoded data for a form, URL-encode each key and each value, join each key to its value with = to get encodedKey=encodedValue, then join the pairs with & to get encKey1=encVal1&encKey2=encVal2.
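A minimal C++ sketch of building form data from key-value pairs might look like the following. The names encode_component and encode_form are hypothetical. The sketch writes a space as %20; HTML forms conventionally use + instead, so treat that choice as an assumption.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: percent-encode one key or value, including any
// '&' or '=' it contains, so they cannot be mistaken for delimiters.
std::string encode_component(const std::string& input) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : input) {
        const bool safe =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '_' || c == '.' || c == '~';
        if (safe) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical helper: encode each key and value, join key and value
// with '=', and join the pairs with '&'.
std::string encode_form(
    const std::vector<std::pair<std::string, std::string>>& fields) {
    std::string out;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i != 0) {
            out += '&';
        }
        out += encode_component(fields[i].first);
        out += '=';
        out += encode_component(fields[i].second);
    }
    return out;
}
```

The delimiters are added after the components are encoded, which is what keeps an & or = inside a value from being confused with a separator.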
Decoding is the reverse process: split the data on &, split each pair on =, and then URL-decode each key and value.
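A decoder can be sketched in C++ as follows. The names hex_value, percent_decode and decode_form are hypothetical. The sketch assumes malformed escapes (a % not followed by two hex digits) should be passed through unchanged, and it does not translate + into a space, which real form decoders usually also do.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: numeric value of a hex digit, or -1 if invalid.
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Hypothetical helper: turn each %XX escape back into the byte it encodes.
std::string percent_decode(const std::string& s) {
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == '%' && i + 2 < s.size()) {
            const int hi = hex_value(s[i + 1]);
            const int lo = hex_value(s[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);
                i += 2;  // skip the two hex digits just consumed
                continue;
            }
        }
        out += s[i];  // ordinary character or malformed escape
    }
    return out;
}

// Hypothetical helper: split on '&', split each pair on the first '=',
// and only then decode the key and the value.
std::vector<std::pair<std::string, std::string>> decode_form(
    const std::string& data) {
    std::vector<std::pair<std::string, std::string>> fields;
    std::size_t start = 0;
    while (start <= data.size()) {
        std::size_t end = data.find('&', start);
        if (end == std::string::npos) {
            end = data.size();
        }
        const std::string pair = data.substr(start, end - start);
        if (!pair.empty()) {
            const std::size_t eq = pair.find('=');
            const std::string key = pair.substr(0, eq);
            const std::string value =
                (eq == std::string::npos) ? std::string() : pair.substr(eq + 1);
            fields.emplace_back(percent_decode(key), percent_decode(value));
        }
        start = end + 1;
    }
    return fields;
}
```

The order matters: splitting happens before decoding, so an encoded %26 or %3D inside a value is never treated as a delimiter.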
It sounds simple, but you might be shocked at how many people get it wrong.
I have glossed over some of the finer details here. As always, the relevant specification is the last word. In this case, that is RFC 1738, which has since been updated by RFC 3986.
Upvotes: 2