hzhu

Reputation: 3799

Clojure Regex: If string is a URL, return string

Given a string in Clojure, how can I return it if it is a valid URL and return nil otherwise? I am looking for the pattern to put in place of ???? below:

 (re-matches #"????" "www.example.com")
 (re-matches #"????" "http://example.com")
 (re-matches #"????" "http://example.org") ; returns "http://example.org"
 (re-matches #"????" "htasdtp:/something") ; returns nil

Upvotes: 2

Views: 1536

Answers (2)

Marcs

Reputation: 3838

Asking how to validate URLs in ClojureScript is basically asking how to do it in JavaScript, because ClojureScript regular expressions compile to native JavaScript regular expressions.

This page compares many regular expressions for validating URLs: https://mathiasbynens.be/demo/url-regex

This is Diego Perini's JavaScript solution:

/^(?:(?:https?|ftp):\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,}))\.?)(?::\d{2,5})?(?:[/?#]\S*)?$/i

In ClojureScript:

(def url-pattern #"(?i)^(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,}))\.?)(?::\d{2,5})?(?:[/?#]\S*)?$")

(re-matches url-pattern "http://www.google.com")
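Because the pattern above uses only non-capturing groups, re-matches returns the whole input string on a match and nil otherwise, which is the behavior the question asks for. A minimal sketch of that behavior, using a deliberately simplified pattern (simple-url-pattern and match-url are hypothetical names, and this pattern is far more permissive than the one above):

```clojure
;; Simplified for illustration only: requires an http(s) scheme followed by
;; non-whitespace characters. It is NOT a replacement for the full pattern.
(def simple-url-pattern #"(?i)https?://\S+")

(defn match-url
  "Returns s when the whole string matches simple-url-pattern, otherwise nil."
  [s]
  (re-matches simple-url-pattern s))

(match-url "http://example.org")  ; => "http://example.org"
(match-url "www.example.com")     ; => nil (no scheme)
(match-url "htasdtp:/something")  ; => nil
```

Swapping url-pattern in for simple-url-pattern should give the stricter check with the same return convention.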

Upvotes: 2

ntalbs

Reputation: 29458

Validating a URL is not simple, and it is arguably too complex to do reliably with a regular expression. Fortunately, the Apache Commons Validator library provides a UrlValidator class.

Since Clojure can use Java libraries, you can use Apache Commons' UrlValidator to validate URLs in your program.

First, add the dependency to your project.clj by putting the following line in your :dependencies vector.

[commons-validator "1.4.1"]

Then define a function, valid-url?, that returns a boolean.

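;; The routines package holds the current UrlValidator; the class of the same
;; name directly under org.apache.commons.validator is deprecated.
(import 'org.apache.commons.validator.routines.UrlValidator)

(defn valid-url? [url-str]
  ;; The default validator accepts the http, https and ftp schemes.
  (let [validator (UrlValidator.)]
    (.isValid validator url-str)))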

Now you can use this function however you like, or modify it to return the URL string when its argument is a valid URL.
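That modification might look like the following sketch. It assumes commons-validator is on the classpath and uses the UrlValidator from the routines package; url-or-nil is a hypothetical name chosen here for illustration.

```clojure
(import 'org.apache.commons.validator.routines.UrlValidator)

(defn url-or-nil
  "Returns url-str if the default UrlValidator accepts it, otherwise nil."
  [url-str]
  ;; `when` yields nil when the test is false, matching what re-matches
  ;; would return for a non-matching string.
  (when (.isValid (UrlValidator.) url-str)
    url-str))

(url-or-nil "http://example.org")  ; => "http://example.org"
(url-or-nil "htasdtp:/something")  ; => nil
```

Returning the string or nil lets the result be used directly in when-let or some-> chains.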

Upvotes: 11
