Reputation: 10621
I need to make sure that URLs without a valid scheme are caught and that http:// is added in front of them.
I know I could search for http:// in the variable, but it would be better practice to use some facility that checks for a valid scheme properly. Is there a way to check for valid schemes?
var a = "www.example.com";
Upvotes: 1
Views: 171
Reputation: 1075567
This is complicated by the fact that what you've shown is a perfectly valid relative URL. Differentiating relative URLs from broken absolute URLs necessarily involves some guesswork.
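To illustrate why the string counts as a relative URL, here is a small sketch using the standard WHATWG `URL` constructor (available in browsers and Node.js). The base URL `http://example.org/dir/page.html` is a made-up value for the example, not something from the question.

```javascript
// The value from the question.
const a = "www.example.com";

// Hypothetical base URL, chosen only for illustration.
const base = "http://example.org/dir/page.html";

// Resolved against a base, "www.example.com" is treated as a relative
// path segment, not as a host name.
const resolved = new URL(a, base).href;

console.log(resolved); // http://example.org/dir/www.example.com
```

So nothing about the string itself marks it as a broken absolute URL; that is a judgment you have to make from context.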
If you know you're not dealing with relative URLs, then by definition we're dealing with links that should have a scheme, and so we can look for ones that don't. If we refer to RFC-3986 §3 we see that the scheme is separated from the rest of the URL by a colon (:):
The following are two example URIs and their component parts:

      foo://example.com:8042/over/there?name=ferret#nose
      \_/   \______________/\_________/ \_________/ \__/
       |           |            |            |        |
    scheme     authority       path        query   fragment
       |   _____________________|__
      / \ /                        \
      urn:example:animal:ferret:nose
...and from §3.1, we can see that a scheme is an alpha character followed by zero or more alphanumerics, +, -, or ., separated from the rest of the URL by a colon (:):
scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
So you can tell with a simple regular expression:
if (!/^[a-z][a-z0-9+.\-]*:/i.test(a)) {
// 'a' doesn't start with a scheme
}
But again, only if you're not dealing with relative URLs.
That regex says:
^ - start of input
[a-z] - any alpha char as normally defined in RFCs (e.g., English letters A through Z)
[a-z0-9+.\-]* - zero or more (that's the * at the end) of any alpha, digit, +, ., or -
: - a colon
...and the i flag means "case-insensitive."
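Putting the check to work for the original goal, here is a minimal sketch that prepends a default scheme when none is found. The helper name `ensureScheme` and its `defaultScheme` parameter are hypothetical, and the sketch assumes every input is meant to be an absolute URL.

```javascript
// Matches an RFC 3986 scheme followed by its ":" delimiter.
const SCHEME_RE = /^[a-z][a-z0-9+.\-]*:/i;

// Hypothetical helper: returns the URL unchanged if it already starts
// with a scheme, otherwise prepends defaultScheme + "://".
// Assumes the input is not intended to be a relative URL.
function ensureScheme(url, defaultScheme = "http") {
  const trimmed = url.trim();
  if (SCHEME_RE.test(trimmed)) {
    return trimmed;
  }
  return defaultScheme + "://" + trimmed;
}

console.log(ensureScheme("www.example.com"));         // http://www.example.com
console.log(ensureScheme("https://www.example.com")); // https://www.example.com
console.log(ensureScheme("mailto:someone@example.com")); // mailto:someone@example.com
```

One caveat with this approach: an input such as example.com:8042/path has no scheme, but "example.com" followed by a colon satisfies the scheme grammar, so the regex treats it as already having one. If host:port inputs are possible, you would need an extra check, for example against a list of schemes you expect.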
Upvotes: 2