user2005848

Reputation: 406

.NET URI is encoded incorrectly

I need to parse http://website.com/page?id=ABCD | EFG:

Dim WebR As HttpWebRequest = DirectCast(WebRequest.Create(URL), HttpWebRequest)

Any normal browser (such as Firefox) would encode the URL like this: http://website.com/page?id=ABCD%20|%20EFG

However, when I use the code above or create a new Uri, the URL gets encoded as: http://website.com/page?id=ABCD%20%7C%20EFG

That won't work for me, because that id doesn't exist.

How can this be fixed?

Upvotes: 0

Views: 62

Answers (1)

Marc Gravell

Reputation: 1062745

The trick here is: don't start with an illegal URL. If you are constructing the URL yourself, it is your job to escape the components. For example:

string id = "ABCD | EFG"; // perhaps via some more complicated code
string url = "http://website.com/page?id=" + Uri.EscapeDataString(id);

This outputs, correctly, http://website.com/page?id=ABCD%20%7C%20EFG. This is the correct URL: the | character is not valid in a URL, so it has to be sent as %7C.
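To connect this back to the request in the question, here is a minimal sketch of escaping the id first and then handing the finished string to Uri. The PageUrl class and its ForId method are hypothetical names used only for illustration; the base address is the one from the question.

```csharp
using System;

public static class PageUrl
{
    // Hypothetical helper: escapes only the id value, never the whole URL,
    // so the "?" and "=" delimiters stay intact.
    public static string ForId(string id)
    {
        return "http://website.com/page?id=" + Uri.EscapeDataString(id);
    }
}

public static class PageUrlDemo
{
    public static void Main()
    {
        string url = PageUrl.ForId("ABCD | EFG");
        Console.WriteLine(url); // http://website.com/page?id=ABCD%20%7C%20EFG

        // The string is already a legal URL, so it can be passed on as-is,
        // e.g. to new Uri(url) or to WebRequest.Create(url) as in the question.
        var uri = new Uri(url);
        Console.WriteLine(uri.AbsoluteUri);
    }
}
```

Escaping the value rather than the full URL matters: running Uri.EscapeDataString over the whole string would also encode the :, /, ? and = that give the URL its structure.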

From https://www.rfc-editor.org/rfc/rfc3986#section-2, the unreserved characters are defined as:

unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"

where ALPHA is defined as A-Z and a-z, and DIGIT is defined as 0-9.

Anything else that appears inside a data value, such as the id here, needs to be %-encoded.
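The rule above can be sketched by hand to show what the escaping amounts to. This is an illustration only, not how the framework implements it: Rfc3986Sketch, IsUnreserved and PercentEncode are hypothetical names, and the sketch assumes the value is encoded as UTF-8 before escaping. In real code, Uri.EscapeDataString is the call to use.

```csharp
using System;
using System.Text;

public static class Rfc3986Sketch
{
    // unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    public static bool IsUnreserved(char c)
    {
        return (c >= 'A' && c <= 'Z')
            || (c >= 'a' && c <= 'z')
            || (c >= '0' && c <= '9')
            || c == '-' || c == '.' || c == '_' || c == '~';
    }

    // Percent-encodes every UTF-8 byte that is not an unreserved character.
    public static string PercentEncode(string value)
    {
        var sb = new StringBuilder();
        foreach (byte b in Encoding.UTF8.GetBytes(value))
        {
            if (b < 0x80 && IsUnreserved((char)b))
            {
                sb.Append((char)b);
            }
            else
            {
                sb.Append('%').Append(b.ToString("X2")); // two uppercase hex digits
            }
        }
        return sb.ToString();
    }
}

public static class Rfc3986SketchDemo
{
    public static void Main()
    {
        Console.WriteLine(Rfc3986Sketch.PercentEncode("ABCD | EFG")); // ABCD%20%7C%20EFG
        Console.WriteLine(Rfc3986Sketch.PercentEncode("a-b.c_d~e"));  // a-b.c_d~e
        Console.WriteLine(Rfc3986Sketch.PercentEncode("a&b=c"));      // a%26b%3Dc
    }
}
```

The space and the pipe both fall outside the unreserved set, which is why the id from the question becomes ABCD%20%7C%20EFG.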

Upvotes: 2
