Keya
Keya

Reputation: 1225

Validating URL in Java

I wanted to know if there is any standard APIs in Java to validate a given URL? I want to check both if the URL string is right i.e. the given protocol is valid and then to check if a connection can be established.

I tried using HttpURLConnection, providing the URL and connecting to it. The first part of my requirement seems to be fulfilled but when I try to perform HttpURLConnection.connect(), 'java.net.ConnectException: Connection refused' exception is thrown.

Can this be because of proxy settings? I tried setting the System properties for proxy but no success.

Let me know what I am doing wrong.

Upvotes: 122

Views: 199329

Answers (12)

arkon
arkon

Reputation: 2296

Using only standard API, pass the string to a URL object then convert it to a URI object. This will accurately determine the validity of the URL according to the RFC2396 standard.

Example (version < 20):

public boolean isValidURL(String url) {

    try {
        new URL(url).toURI();
    } catch (MalformedURLException | URISyntaxException e) {
        return false;
    }

    return true;
}

As per Yann39's comment below..

Example (version >= 20):

public boolean isValidURL(String url) {

    try {
        URI.create(url).toURL();
    } catch (MalformedURLException | IllegalArgumentException e) {
        return false;
    }

    return true;
}

Upvotes: 27

blackorbs
blackorbs

Reputation: 625

For basic need you can use:

private boolean isValidUrl(String urlString){
        try {
            URL url = new URL(urlString);
            url.toURI();
            return Patterns.WEB_URL.matcher(urlString).matches();
        } catch (MalformedURLException | URISyntaxException e) {
            return false;
        }
    }

If you want to ensure url is reachable, you may need to do network call in background thread. I used rxjava:

new SingleFromCallable<>(new Callable<Boolean>() {
            @Override
            public Boolean call() {
                return URLIsReachable(urlString);
            }
        })
                .subscribeOn(Schedulers.io())
                .observeOn(AndroidSchedulers.mainThread())
                .subscribe(new SingleObserver<Boolean>() {
                    @Override
                    public void onSubscribe(@io.reactivex.rxjava3.annotations.NonNull Disposable d) {}

                    @Override
                    public void onSuccess(@io.reactivex.rxjava3.annotations.NonNull Boolean isValidUrl) {
                        loadingDialog.dismiss();
                        if(isValidUrl) // do your stuff
                    }

                    @Override
                    public void onError(@io.reactivex.rxjava3.annotations.NonNull Throwable e) {
                        loadingDialog.dismiss();
                    }
                });
private boolean URLIsReachable(String urlString) {
        try {
            URL url = new URL(urlString);
            HttpURLConnection urlConnection = (HttpURLConnection) url.openConnection();
            int responseCode = urlConnection.getResponseCode();
            urlConnection.disconnect();
            return responseCode == 200;
        } catch (IOException e) {
            e.printStackTrace();
            return false;
        }
    }

Upvotes: 0

giacomello
giacomello

Reputation: 19

This is what I use to validate CDN urls (must start with https, but that's easy to customise). This will also not allow using IP addresses.

public static final boolean validateURL(String url) {  
    var regex = Pattern.compile("^[https:\\/\\/(www\\.)?a-zA-Z0-9@:%._\\+~#=]{2,256}\\.[a-z]{2,6}\\b([-a-zA-Z0-9@:%_\\+.~#?&//=]*)");
    var matcher = regex.matcher(url);
    return matcher.find();
}

Upvotes: -1

Genaut
Genaut

Reputation: 1851

I think the best response is from the user @b1nary.atr0phy. Somehow, I recommend combine the method from the b1nay.atr0phy response with a regex to cover all the possible cases.

public static final URL validateURL(String url, Logger logger) {

        URL u = null;
        try {  
            Pattern regex = Pattern.compile("(?i)^(?:(?:https?|ftp)://)(?:\\S+(?::\\S*)?@)?(?:(?!(?:10|127)(?:\\.\\d{1,3}){3})(?!(?:169\\.254|192\\.168)(?:\\.\\d{1,3}){2})(?!172\\.(?:1[6-9]|2\\d|3[0-1])(?:\\.\\d{1,3}){2})(?:[1-9]\\d?|1\\d\\d|2[01]\\d|22[0-3])(?:\\.(?:1?\\d{1,2}|2[0-4]\\d|25[0-5])){2}(?:\\.(?:[1-9]\\d?|1\\d\\d|2[0-4]\\d|25[0-4]))|(?:(?:[a-z\\u00a1-\\uffff0-9]-*)*[a-z\\u00a1-\\uffff0-9]+)(?:\\.(?:[a-z\\u00a1-\\uffff0-9]-*)*[a-z\\u00a1-\\uffff0-9]+)*(?:\\.(?:[a-z\\u00a1-\\uffff]{2,}))\\.?)(?::\\d{2,5})?(?:[/?#]\\S*)?$");
            Matcher matcher = regex.matcher(url);
            if(!matcher.find()) {
                throw new URISyntaxException(url, "La url no está formada correctamente.");
            }
            u = new URL(url);  
            u.toURI(); 
        } catch (MalformedURLException e) {  
            logger.error("La url no está formada correctamente.");
        } catch (URISyntaxException e) {  
            logger.error("La url no está formada correctamente.");  
        }  

        return u;  

    }

Upvotes: -1

dened
dened

Reputation: 4310

There is a way to perform URL validation in strict accordance to standards in Java without resorting to third-party libraries:

boolean isValidURL(String url) {
  try {
    new URI(url).parseServerAuthority();
    return true;
  } catch (URISyntaxException e) {
    return false;
  }
}

The constructor of URI checks that url is a valid URI, and the call to parseServerAuthority ensures that it is a URL (absolute or relative) and not a URN.

Upvotes: 11

Yonatan
Yonatan

Reputation: 2531

For the benefit of the community, since this thread is top on Google when searching for
"url validator java"


Catching exceptions is expensive, and should be avoided when possible. If you just want to verify your String is a valid URL, you can use the UrlValidator class from the Apache Commons Validator project.

For example:

String[] schemes = {"http","https"}; // DEFAULT schemes = "http", "https", "ftp"
UrlValidator urlValidator = new UrlValidator(schemes);
if (urlValidator.isValid("ftp://foo.bar.com/")) {
   System.out.println("URL is valid");
} else {
   System.out.println("URL is invalid");
}

Upvotes: 172

penduDev
penduDev

Reputation: 4785

Use the android.webkit.URLUtil on android:

URLUtil.isValidUrl(URL_STRING);

Note: It is just checking the initial scheme of URL, not that the entire URL is valid.

Upvotes: 8

Martin
Martin

Reputation: 2552

The java.net.URL class is in fact not at all a good way of validating URLs. MalformedURLException is not thrown on all malformed URLs during construction. Catching IOException on java.net.URL#openConnection().connect() does not validate URL either, only tell wether or not the connection can be established.

Consider this piece of code:

    try {
        new URL("http://.com");
        new URL("http://com.");
        new URL("http:// ");
        new URL("ftp://::::@example.com");
    } catch (MalformedURLException malformedURLException) {
        malformedURLException.printStackTrace();
    }

..which does not throw any exceptions.

I recommend using some validation API implemented using a context free grammar, or in very simplified validation just use regular expressions. However I need someone to suggest a superior or standard API for this, I only recently started searching for it myself.

Note It has been suggested that URL#toURI() in combination with handling of the exception java.net. URISyntaxException can facilitate validation of URLs. However, this method only catches one of the very simple cases above.

The conclusion is that there is no standard java URL parser to validate URLs.

Upvotes: 43

Keya
Keya

Reputation: 1225

Thanks. Opening the URL connection by passing the Proxy as suggested by NickDK works fine.

//Proxy instance, proxy ip = 10.0.0.1 with port 8080
Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("10.0.0.1", 8080));
conn = new URL(urlString).openConnection(proxy);

System properties however doesn't work as I had mentioned earlier.

Thanks again.

Regards, Keya

Upvotes: -3

Doc Davluz
Doc Davluz

Reputation: 4250

Just important to point that the URL object handle both validation and connection. Then, only protocols for which a handler has been provided in sun.net.www.protocol are authorized (file, ftp, gopher, http, https, jar, mailto, netdoc) are valid ones. For instance, try to make a new URL with the ldap protocol:

new URL("ldap://myhost:389")

You will get a java.net.MalformedURLException: unknown protocol: ldap.

You need to implement your own handler and register it through URL.setURLStreamHandlerFactory(). Quite overkill if you just want to validate the URL syntax, a regexp seems to be a simpler solution.

Upvotes: 1

Olly
Olly

Reputation: 7758

You need to create both a URL object and a URLConnection object. The following code will test both the format of the URL and whether a connection can be established:

try {
    URL url = new URL("http://www.yoursite.com/");
    URLConnection conn = url.openConnection();
    conn.connect();
} catch (MalformedURLException e) {
    // the URL is not in a valid form
} catch (IOException e) {
    // the connection couldn't be established
}

Upvotes: 33

NickDK
NickDK

Reputation: 5197

Are you sure you're using the correct proxy as system properties?

Also if you are using 1.5 or 1.6 you could pass a java.net.Proxy instance to the openConnection() method. This is more elegant imo:

//Proxy instance, proxy ip = 10.0.0.1 with port 8080
Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("10.0.0.1", 8080));
conn = new URL(urlString).openConnection(proxy);

Upvotes: 0

Related Questions