Reputation: 1286
I have the following regex to detect URLs:
/(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig
However, it doesn't detect URLs such as www.google.ca and tlk.tc/ApSE. Is there a regex that can detect these URLs as well? I am using JavaScript.
Upvotes: 1
Views: 628
Reputation: 7604
Edit:
Try this one:
((\b(https?|ftp|file):\/\/)?[-A-Z0-9+&@#\/%?=~_|!:,.;]+\.[-A-Z0-9+&@#\/%=~_|]+)
It makes the scheme optional, to support the two cases that you show in your example.
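A minimal sketch of how this pattern could be used from JavaScript. The `i` and `g` flags are an assumption carried over from the regex in the question, and `urlPattern` and `findUrls` are names made up for this example.

```javascript
// The optional-scheme pattern from this answer. The i and g flags are an
// assumption taken from the question's original regex.
const urlPattern = /((\b(https?|ftp|file):\/\/)?[-A-Z0-9+&@#\/%?=~_|!:,.;]+\.[-A-Z0-9+&@#\/%=~_|]+)/ig;

// Hypothetical helper: return every substring of `text` that the pattern matches.
function findUrls(text) {
  return text.match(urlPattern) || [];
}

const sample = "See www.google.ca and tlk.tc/ApSE or http://example.com/path";
console.log(findUrls(sample));
// [ 'www.google.ca', 'tlk.tc/ApSE', 'http://example.com/path' ]
```

One thing to watch for: the character class after the last dot does not include `?`, so for an input like http://example.com/a?b=1 the match stops at http://example.com/a and the query string is left out.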
RFC 2396, the IETF specification for URI syntax, gives the following regular expression for parsing a URI reference (Appendix B):
^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
which maps the capture groups to the following components:
scheme = $2
authority = $4
path = $5
query = $7
fragment = $9
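As a rough illustration, the group mapping above can be applied in JavaScript like this. `uriPattern` and `parseUri` are names introduced for this sketch, and the slashes are escaped only because the pattern is written as a regex literal.

```javascript
// The parsing regex from RFC 2396, written as a JavaScript literal.
const uriPattern = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/;

// Hypothetical helper: split a URI reference into its components.
// Every part of the pattern is optional, so exec() always returns a match;
// components that are absent come back as undefined.
function parseUri(uri) {
  const m = uriPattern.exec(uri);
  return {
    scheme: m[2],
    authority: m[4],
    path: m[5],
    query: m[7],
    fragment: m[9],
  };
}

console.log(parseUri("http://www.ics.uci.edu/pub/ietf/uri/#Related"));
// scheme "http", authority "www.ics.uci.edu", path "/pub/ietf/uri/",
// query undefined, fragment "Related"

console.log(parseUri("www.google.ca"));
// scheme and authority are undefined; the whole input ends up in path
```

This regex splits a string into parts rather than deciding whether it is a URL, so it accepts any input, including scheme-less strings, which it treats as a bare path.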
Note that the examples you give, www.google.ca and tlk.tc/ApSE, are not "valid" URLs (they lack a scheme), but I believe they'd be matched by the regex anyway.
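If the matched strings later need to be used as real links, one possible approach is to add a default scheme before handing them to the standard `URL` constructor, which rejects scheme-less input when no base is given. This is only a sketch: `toAbsoluteUrl` is a made-up helper, and defaulting to `http://` is an assumption.

```javascript
// Hypothetical helper: turn a matched candidate into an absolute URL string.
// Assumption: candidates without a scheme should default to http://.
function toAbsoluteUrl(candidate) {
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(candidate);
  // new URL() throws a TypeError if the result still cannot be parsed.
  return new URL(hasScheme ? candidate : "http://" + candidate).href;
}

console.log(toAbsoluteUrl("www.google.ca")); // "http://www.google.ca/"
console.log(toAbsoluteUrl("tlk.tc/ApSE"));   // "http://tlk.tc/ApSE"

try {
  new URL("www.google.ca"); // no scheme and no base URL
} catch (err) {
  console.log(err instanceof TypeError); // true
}
```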
Upvotes: 3
Reputation: 15958
This expression does what you want. Note that what it matches is not necessarily a valid URL, but it fits your requirements:
/(\b(?:(?:https?|ftp|file):\/\/|www\.)[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])|([\S]+\.([a-z]{2,})+?\/[\S]+)/gi
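A small sketch of how a pattern of this shape could be applied and which branch produced each match. It uses a form in which the scheme and the `www.` prefix sit in one alternation ahead of the shared character classes. `linkPattern`, `findLinks`, and the `prefixed` / `bare` labels are names invented for this example.

```javascript
// Branch 1: a scheme or "www." prefix followed by URL characters.
// Branch 2: a bare "host.tld/path" string without any prefix.
const linkPattern = /(\b(?:(?:https?|ftp|file):\/\/|www\.)[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])|([\S]+\.([a-z]{2,})+?\/[\S]+)/gi;

// Hypothetical helper: list every match together with the branch that matched.
function findLinks(text) {
  return Array.from(text.matchAll(linkPattern), (m) => ({
    match: m[0],
    kind: m[1] !== undefined ? "prefixed" : "bare",
  }));
}

console.log(findLinks("www.google.ca tlk.tc/ApSE http://example.com/a?b=1"));
// [ { match: 'www.google.ca', kind: 'prefixed' },
//   { match: 'tlk.tc/ApSE', kind: 'bare' },
//   { match: 'http://example.com/a?b=1', kind: 'prefixed' } ]
```

The bare branch relies on `\S+`, so in running prose it will also pick up trailing punctuation such as a full stop directly after the path.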
Upvotes: 0