Reputation: 529
I am creating a small dictionary application with an additional option to use Google Translate. Here is the problem: when I receive the response from Google and show it in a text box, I see strange symbols instead of the translated text. Here is the code of the method that queries Google:
public string TranslateText(string inputText, string languagePair)
{
    // URL-encode both values so spaces and non-ASCII input survive the query string
    string url = String.Format("http://www.google.com/translate_t?hl=en&ie=UTF8&text={0}&langpair={1}",
        Uri.EscapeDataString(inputText), Uri.EscapeDataString(languagePair));
    using (WebClient webClient = new WebClient())
    {
        webClient.Encoding = System.Text.Encoding.UTF8;
        // Get the page that contains the translated text
        string result = webClient.DownloadString(url);
        // Cut out the contents of the first <span title="..."> element
        result = result.Substring(result.IndexOf("<span title=\"") + "<span title=\"".Length);
        result = result.Substring(result.IndexOf(">") + 1);
        result = result.Substring(0, result.IndexOf("</span>"));
        return result.Trim();
    }
}
I call this method as follows, after the translate button is clicked:
string resultText;
string inputText = tbInputWord.Text; // Text is already a string, no ToString() needed
if (!string.IsNullOrWhiteSpace(inputText))
{
    ExtendedGoogleTranslate urlTranslate = new ExtendedGoogleTranslate();
    resultText = urlTranslate.TranslateText(inputText, "en|bg");
    tbOutputWord.Text = resultText;
}
I am translating from English (en) to Bulgarian (bg) and have set the WebClient encoding to UTF-8, so I suspect I am missing a step in the calling code that should process resultText before it is assigned to the tbOutputWord text box. I know the code itself works, because if I translate from English to French, for example, it shows the correct result.
Upvotes: 0
Views: 1505
Reputation: 4481
For some reason, Google does not honor the ie=UTF8 query parameter here. Adding the following headers to the request makes it return UTF-8:
WebClient webClient = new WebClient();
webClient.Encoding = System.Text.Encoding.UTF8;
webClient.Headers.Add(HttpRequestHeader.UserAgent, "Mozilla/5.0");
webClient.Headers.Add(HttpRequestHeader.AcceptCharset, "UTF-8");
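As a rough sketch of how these headers could be combined with the request from the question, the snippet below separates building the URL from configuring the client. The names TranslateRequestSketch, BuildUrl, and CreateClient are hypothetical and only for illustration. The URL is the one from the question, and escaping the query values with Uri.EscapeDataString is an extra precaution assumed here, not part of the fix above.

```csharp
using System;
using System.Net;
using System.Text;

// Hypothetical helper class; the names are illustrative and not from the original post.
public static class TranslateRequestSketch
{
    // Builds the request URL from the question, escaping both query values
    // (assumption: escaping is added here as a precaution).
    public static string BuildUrl(string inputText, string languagePair)
    {
        return string.Format(
            "http://www.google.com/translate_t?hl=en&ie=UTF8&text={0}&langpair={1}",
            Uri.EscapeDataString(inputText),
            Uri.EscapeDataString(languagePair));
    }

    // Creates a WebClient that decodes responses as UTF-8 and sends the
    // two request headers suggested in the answer.
    public static WebClient CreateClient()
    {
        var webClient = new WebClient();
        webClient.Encoding = Encoding.UTF8;
        webClient.Headers.Add(HttpRequestHeader.UserAgent, "Mozilla/5.0");
        webClient.Headers.Add(HttpRequestHeader.AcceptCharset, "UTF-8");
        return webClient;
    }
}
```

Inside TranslateText, the client would then be obtained with using (WebClient webClient = TranslateRequestSketch.CreateClient()) and the page downloaded with webClient.DownloadString(TranslateRequestSketch.BuildUrl(inputText, languagePair)), leaving the rest of the parsing unchanged.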
Upvotes: 2