UnknownHostException Android utf-8 encoding URL

String url = String.format("http://%s.jpg.to", URLEncoder.encode("свинья", "utf-8"));
new URL(url).openStream();
Document doc = Jsoup.connect(url).get();

I want to read a web page whose URL contains Russian characters, but I get an exception (Android 4.1.1):

W/System.err: java.net.UnknownHostException: http://%D1%81%D0%B2%D0%B8%D0%BD%D1%8C%D1%8F.jpg.to
W/System.err:     at libcore.net.http.HttpConnection$Address.<init>(HttpConnection.java:283)
W/System.err:     at libcore.net.http.HttpConnection.connect(HttpConnection.java:128)
W/System.err:     at libcore.net.http.HttpEngine.openSocketConnection(HttpEngine.java:315)
W/System.err:     at libcore.net.http.HttpEngine.connect(HttpEngine.java:310)
W/System.err:     at libcore.net.http.HttpEngine.sendSocketRequest(HttpEngine.java:289)
W/System.err:     at libcore.net.http.HttpEngine.sendRequest(HttpEngine.java:239)
W/System.err:     at libcore.net.http.HttpURLConnectionImpl.connect(HttpURLConnectionImpl.java:80)
W/System.err:     at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:563)
W/System.err:     at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:540)
W/System.err:     at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:227)
W/System.err:     at org.jsoup.helper.HttpConnection.get(HttpConnection.java:216)
W/System.err:     at test.jpgto.MainActivity$RetrieveImageTask.doInBackground(MainActivity.java:63)
W/System.err:     at test.jpgto.MainActivity$RetrieveImageTask.doInBackground(MainActivity.java:49)
W/System.err:     at android.os.AsyncTask$2.call(AsyncTask.java:287)
W/System.err:     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:305)
W/System.err:     at java.util.concurrent.FutureTask.run(FutureTask.java:137)
W/System.err:     at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:230)
W/System.err:     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1076)
W/System.err:     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:569)
W/System.err:     at java.lang.Thread.run(Thread.java:856)

But a link such as http://2.jpg.to/ works fine. What am I doing wrong?

Upvotes: 0

Views: 352

Answers (2)

ishmaelMakitla

Reputation: 3812

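What happens when you put the characters into the URL as they are, without URLEncoder? URLEncoder.encode produces percent-encoding, which is meant for query strings and form data, not for host names. A non-ASCII host name has to be converted to Punycode instead, which java.net.IDN does. Try something like this: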

String host = "свинья";
// Format the URL, then let convertUrlToPunycodeIfNeeded (which uses java.net.IDN) convert the host to Punycode
String url = convertUrlToPunycodeIfNeeded(String.format("http://%s.jpg.to", host));
// Then use the URL as usual
new URL(url).openStream();
Document doc = Jsoup.connect(url).get();

Below is the code showing how to use java.net.IDN in your case:

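    import java.net.IDN;
    import java.nio.charset.Charset;

    // Converts a non-ASCII host name to its ASCII-compatible (Punycode) form.
    // Everything after the scheme is passed to IDN.toASCII, so the URL should contain only a host.
    public static String convertUrlToPunycodeIfNeeded(String url) {
        // Only convert when the URL contains characters outside US-ASCII
        if (!Charset.forName("US-ASCII").newEncoder().canEncode(url)) {
            if (url.toLowerCase().startsWith("http://")) {
                url = "http://" + IDN.toASCII(url.substring(7));
            } else if (url.toLowerCase().startsWith("https://")) {
                url = "https://" + IDN.toASCII(url.substring(8));
            } else {
                url = IDN.toASCII(url);
            }
        }
        return url;
    }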
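
The helper above passes everything after the scheme to IDN.toASCII, so a path or query string would be converted along with the host. If that matters, one possible variation is to parse the URL first and convert only the host. This is a rough sketch, not a tested Android implementation: the class name IdnUrls and the method toAsciiUrl are made up for illustration, and the fragment (the part after #) is dropped.

```java
import java.net.IDN;
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper, not part of any library.
final class IdnUrls {
    private IdnUrls() {}

    // Returns the URL with only its host converted to Punycode.
    static String toAsciiUrl(String url) {
        try {
            URL parsed = new URL(url);
            // Convert just the host; an all-ASCII host comes back unchanged
            String asciiHost = IDN.toASCII(parsed.getHost());
            // getFile() is the path plus the query string; the fragment is dropped in this sketch
            URL ascii = new URL(parsed.getProtocol(), asciiHost, parsed.getPort(), parsed.getFile());
            return ascii.toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid URL: " + url, e);
        }
    }
}
```

With this, IdnUrls.toAsciiUrl("http://свинья.jpg.to/") should give http://xn--b1ampn2ds.jpg.to/, which can then be passed to Jsoup.connect as before.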

I found this nice example here: Example 1.

Upvotes: 1

    String url = String.format("http://%s.jpg.to", IDN.toASCII("свинья"));

This gives the link http://xn--b1ampn2ds.jpg.to
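
To make the difference visible, here is a small sketch that puts the two encodings side by side. The class and method names (HostEncodingDemo, percentEncode, toPunycode, fromPunycode) are only illustrative; the calls underneath are the standard java.net.URLEncoder and java.net.IDN.

```java
import java.io.UnsupportedEncodingException;
import java.net.IDN;
import java.net.URLEncoder;

// Illustrative names only; the real work is done by URLEncoder and IDN.
final class HostEncodingDemo {
    private HostEncodingDemo() {}

    // Percent-encoding: fine for query strings and form data, but not resolvable as a host name
    static String percentEncode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    // Punycode (the "xn--" form): what DNS expects for an internationalized host name
    static String toPunycode(String host) {
        return IDN.toASCII(host);
    }

    // Reverse direction, useful when showing the host to a user
    static String fromPunycode(String asciiHost) {
        return IDN.toUnicode(asciiHost);
    }
}
```

For "свинья", percentEncode returns %D1%81%D0%B2%D0%B8%D0%BD%D1%8C%D1%8F (the host that shows up in the UnknownHostException), while toPunycode returns xn--b1ampn2ds, and fromPunycode turns that back into свинья.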

Upvotes: 1
