Reputation: 1063
As I was working with the WebClient class, I noticed that a simple call like this
string downloadedString = new WebClient().DownloadString("http://whatever");
produced a string using an incorrect encoding, even though the response contained a proper Content-Type header: application/json; charset=utf-8.
When I looked at the source code, I found that DownloadString doesn't look at the response headers at all. Instead, it uses request.ContentType and, if no charset is present there, falls back to the Encoding property (which has to be set beforehand; otherwise it is the system default).
It seems weird that we have to tell the WebClient object which encoding to use before sending the request (by adding a Content-Type header or setting the Encoding property directly). This makes DownloadString pointless: if we want the right encoding, we have to use DownloadData or plain old WebRequest and write code that parses the response headers manually to get the correct response string.
Does anyone know the reason for this behavior?
Is there a better way in .NET to download an HTTP response as a string than manually parsing the response's Content-Type header?
Upvotes: 1
Views: 1654
Reputation: 7629
The WebClient source code seems to indicate that when you call DownloadString, it uses the request's content type to pick the encoding for the response, which is weird and probably a bug.
See this excellent answer to a similar question. It includes code that uses DownloadData to get the response, then converts it to a string using the correct encoding, as specified in the response's Content-Type header.
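For reference, that approach might look roughly like the sketch below. This is not the code from the linked answer. The helper names GetEncodingFromContentType and DownloadStringUsingResponseEncoding are made up for illustration, and falling back to client.Encoding when the header carries no usable charset is an assumption on my part.

```csharp
using System;
using System.Net;
using System.Net.Mime;
using System.Text;

static class WebClientExtensions
{
    // Hypothetical helper (not part of the framework): picks the encoding
    // named by a Content-Type header value, or returns the fallback when
    // the header is missing, malformed, or names an unknown charset.
    public static Encoding GetEncodingFromContentType(string contentTypeHeader, Encoding fallback)
    {
        if (string.IsNullOrWhiteSpace(contentTypeHeader))
            return fallback;

        try
        {
            string charset = new ContentType(contentTypeHeader).CharSet;
            return string.IsNullOrEmpty(charset) ? fallback : Encoding.GetEncoding(charset);
        }
        catch (FormatException)
        {
            return fallback; // header value could not be parsed
        }
        catch (ArgumentException)
        {
            return fallback; // charset name is not a known encoding
        }
    }

    // Hypothetical extension method: downloads the raw bytes, then decodes
    // them using the charset from the *response* Content-Type header.
    public static string DownloadStringUsingResponseEncoding(this WebClient client, string address)
    {
        byte[] data = client.DownloadData(address);
        string contentType = client.ResponseHeaders[HttpResponseHeader.ContentType];
        Encoding encoding = GetEncodingFromContentType(contentType, client.Encoding);
        return encoding.GetString(data);
    }
}
```

With that in place, the call from the question becomes new WebClient().DownloadStringUsingResponseEncoding("http://whatever"). Note that this sketch does not strip a byte order mark from the downloaded data, so you may need to handle that separately.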
Upvotes: 1