M.Veli

Reputation: 529

Converting strings in C# from Google Translate

I am creating a small dictionary with an additional option to use Google Translate. The problem: when I receive the response from Google and show it in a textbox, I see strange symbols instead of the translated text. Here is the method that queries Google:

public string TranslateText(string inputText, string languagePair)
{
    // URL-encode both values so spaces and non-ASCII characters survive the query string
    string url = String.Format(
        "http://www.google.com/translate_t?hl=en&ie=UTF8&text={0}&langpair={1}",
        Uri.EscapeDataString(inputText), Uri.EscapeDataString(languagePair));

    using (WebClient webClient = new WebClient())
    {
        // Decode the response body as UTF-8
        webClient.Encoding = System.Text.Encoding.UTF8;

        // Get translated text
        string result = webClient.DownloadString(url);

        // Cut out the contents of the first <span title="..."> element
        result = result.Substring(result.IndexOf("<span title=\"") + "<span title=\"".Length);
        result = result.Substring(result.IndexOf(">") + 1);
        result = result.Substring(0, result.IndexOf("</span>"));

        return result.Trim();
    }
}

I call this method like this (after the Translate button is clicked):

string resultText;
string inputText = tbInputWord.Text;

// Skip empty or whitespace-only input
if (!string.IsNullOrWhiteSpace(inputText))
{
    ExtendedGoogleTranslate urlTranslate = new ExtendedGoogleTranslate();

    resultText = urlTranslate.TranslateText(inputText, "en|bg");

    tbOutputWord.Text = resultText;
}

I am translating from English (en) to Bulgarian (bg) and have set the WebClient encoding to UTF-8, so I suspect the calling code is missing a step that converts resultText before it is assigned to the tbOutputWord textbox. The code itself works: if I translate from English to French, for example, it shows the correct result.

Upvotes: 0

Views: 1505

Answers (1)

Frank

Reputation: 4481

Google does not seem to honor the ie=UTF8 query parameter here. Adding these headers to the request makes it return UTF-8:

WebClient webClient = new WebClient();
webClient.Encoding = System.Text.Encoding.UTF8;
webClient.Headers.Add(HttpRequestHeader.UserAgent, "Mozilla/5.0");
webClient.Headers.Add(HttpRequestHeader.AcceptCharset, "UTF-8");
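If the headers alone are not enough, another option is to stop assuming UTF-8 and decode the raw bytes with whatever charset the server reports. The following is only a sketch: it assumes the response carries a charset in its Content-Type header, and the names ResponseDecoding, ResolveEncoding and DownloadText are hypothetical helpers, not part of the original code.

```csharp
using System;
using System.Net;
using System.Net.Mime;
using System.Text;

public static class ResponseDecoding
{
    // Hypothetical helper: returns the encoding named in a Content-Type header,
    // or the fallback when the header is missing, malformed, or names an
    // encoding this runtime does not know.
    public static Encoding ResolveEncoding(string contentTypeHeader, Encoding fallback)
    {
        if (string.IsNullOrWhiteSpace(contentTypeHeader))
            return fallback;

        try
        {
            string charset = new ContentType(contentTypeHeader).CharSet;
            return string.IsNullOrEmpty(charset) ? fallback : Encoding.GetEncoding(charset);
        }
        catch (Exception ex) when (ex is FormatException
                                || ex is ArgumentException
                                || ex is NotSupportedException)
        {
            return fallback;
        }
    }

    // Hypothetical helper: downloads the raw bytes and decodes them using the
    // charset reported by the server (UTF-8 if none is reported).
    public static string DownloadText(string url)
    {
        using (var webClient = new WebClient())
        {
            webClient.Headers.Add(HttpRequestHeader.UserAgent, "Mozilla/5.0");
            webClient.Headers.Add(HttpRequestHeader.AcceptCharset, "UTF-8");

            byte[] raw = webClient.DownloadData(url);
            string header = webClient.ResponseHeaders[HttpResponseHeader.ContentType];
            return ResolveEncoding(header, Encoding.UTF8).GetString(raw);
        }
    }
}
```

In TranslateText, the call to webClient.DownloadString(url) could then be replaced with ResponseDecoding.DownloadText(url). On .NET Core and later, legacy code pages such as windows-1251 are only available after registering the code pages encoding provider; without it this sketch falls back to UTF-8.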

Upvotes: 2
