holdenmcgrohen

Reputation: 1063

C# - WebClient.DownloadString does not detect response encoding

While working with the WebClient class, I noticed that a simple call like this

string downloadedString = new WebClient().DownloadString("http://whatever");

produced a string decoded with the wrong encoding, even though the response carried a proper Content-Type header: application/json; charset=utf-8.

When I looked at the source code, I found that DownloadString doesn't look at the response headers at all. Instead it uses request.ContentType, and if no charset is present there, it falls back to the Encoding property (which has to be set beforehand; otherwise it is the system's default encoding).

It seems weird that we have to tell the WebClient object which encoding to use before sending the request (by adding a Content-Type header or setting the Encoding property directly). That makes DownloadString pointless: if we want the right encoding, we have to use DownloadData or plain old WebRequest and write code that parses the response headers manually to get the correct response string.
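For illustration, the "set the encoding beforehand" workaround could look like the sketch below. The class and method names are hypothetical, and it assumes the server actually sends UTF-8; if it sends anything else, the result will still be decoded incorrectly.

```csharp
using System.Net;
using System.Text;

static class Utf8Download
{
    // Hypothetical helper: a WebClient whose fallback encoding is UTF-8
    // instead of the system default.
    public static WebClient CreateUtf8Client()
    {
        return new WebClient { Encoding = Encoding.UTF8 };
    }

    // Assumption: the caller knows in advance that the response is UTF-8.
    public static string DownloadUtf8String(string url)
    {
        using (WebClient client = CreateUtf8Client())
        {
            return client.DownloadString(url);
        }
    }
}
```

This only moves the guess from the system default to UTF-8; it does not read the charset the server reports.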

Does anyone know the reason for this behavior? Is there a better way in .NET to download an HTTP response as a string than manually parsing the response's Content-Type header?

Upvotes: 1

Views: 1654

Answers (1)

greg84

Reputation: 7629

The WebClient source code seems to indicate that when you call DownloadString, it takes the encoding for the response from the request's Content-Type header, which is weird and probably a bug.

See this excellent answer to a similar question. It includes code that uses DownloadData to get the response, then converts it to a string using the correct encoding, as specified in the response's Content-Type header.
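The linked code is not reproduced here, but the general approach can be sketched roughly as follows. This is a minimal sketch, not the linked answer's implementation: the class and method names are hypothetical, and falling back to UTF-8 when the header has no usable charset is an assumption.

```csharp
using System;
using System.Net;
using System.Net.Mime;
using System.Text;

static class WebClientCharset
{
    // Decodes the body with the charset named in a Content-Type value,
    // or with the fallback when the charset is missing or unrecognized.
    public static string Decode(byte[] data, string contentType, Encoding fallback)
    {
        Encoding encoding = fallback;
        if (!string.IsNullOrWhiteSpace(contentType))
        {
            try
            {
                string charset = new ContentType(contentType).CharSet;
                if (!string.IsNullOrEmpty(charset))
                    encoding = Encoding.GetEncoding(charset);
            }
            catch (FormatException) { }   // malformed Content-Type value
            catch (ArgumentException) { } // unknown charset name
        }
        return encoding.GetString(data);
    }

    // Downloads raw bytes, then decodes them using the response's Content-Type header.
    public static string DownloadStringUsingResponseCharset(string url)
    {
        using (var client = new WebClient())
        {
            byte[] data = client.DownloadData(url);
            string contentType = client.ResponseHeaders[HttpResponseHeader.ContentType];
            return Decode(data, contentType, Encoding.UTF8); // UTF-8 fallback is an assumption
        }
    }
}
```

Keeping the decoding in a separate method makes it testable without a network call. The sketch does not strip a byte order mark, so a leading BOM would remain in the returned string.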

Upvotes: 1
