Ron I

Reputation: 4250

Require http:// or https:// in beginning of regular expression

My regular expression works great, except that I want to require that the URL start with http:// or https://, and my attempts to change the regular expression break it. Any help is appreciated!

    let rx = new RegExp(/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[\-;:&=\+\$,\w]+@)?[A-Za-z0-9\.\-]+|(?:www\.|[\-;:&=\+\$,\w]+@)[A-Za-z0-9\.\-]+)((?:\/[\+~%\/\.\w\-_]*)?\??(?:[\-\+=&;%@\.\w_]*)#?(?:[\.\!\/\\\w]*))?)/g);
    let replaced = text.replace(
        rx,
        `<a href="$1" target="_blank">$1</a>`
    );

Upvotes: 2

Views: 1704

Answers (1)

Wiktor Stribiżew

Reputation: 626794

Replace [A-Za-z]{3,9}:(?:\/\/)? with https?:\/\/. Also, since you are not using any variables to build the regex, use a regex literal rather than the RegExp constructor.
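To illustrate the literal-versus-constructor point, here is a minimal sketch. The sample string, the shortened pattern, and the variable names are made up for the demo and are not part of the question's code.

```javascript
// Hypothetical sample input, not taken from the question.
const sample = "see http://example.com and https://example.org";

// Regex literal: the pattern is written as-is, with single backslashes.
const literalRx = /\bhttps?:\/\/[A-Za-z0-9.-]+/g;

// Constructor with a string: mainly useful when the pattern is built from
// variables; every backslash has to be doubled inside the string.
const ctorRx = new RegExp("\\bhttps?:\\/\\/[A-Za-z0-9.-]+", "g");

const fromLiteral = sample.match(literalRx);
const fromCtor = sample.match(ctorRx);

console.log(fromLiteral); // [ 'http://example.com', 'https://example.org' ]
console.log(fromCtor);    // same two URLs
```

Both forms compile to the same pattern here; the literal is simply easier to read and harder to get wrong.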

/\bhttps?:\/\/(?:(?:[-;:&=+$,\w]+@)?[A-Za-z0-9.-]+|(?:www\.|[-;:&=+$,\w]+@)[A-Za-z0-9.-]+)(?:\/[+~%\/.\w_-]*\??(?:[-+=&;%@.\w_]*)#?[.!\/\\\w]*)?/g


The \bhttps?:\/\/ prefix matches http:// or https:// starting at a word boundary. Because it sits outside the alternation group (?:(?:[-;:&=+$,\w]+@)?[A-Za-z0-9.-]+|(?:www\.|[-;:&=+$,\w]+@)[A-Za-z0-9.-]+), it applies to both alternatives, (?:[-;:&=+$,\w]+@)?[A-Za-z0-9.-]+ and (?:www\.|[-;:&=+$,\w]+@)[A-Za-z0-9.-]+, so the protocol is required before www., too.
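A quick check of that behavior, as a sketch: the pattern is the one from this answer, while the test string and variable names are invented for the example.

```javascript
// Pattern copied from the answer; the candidate string is a made-up test input.
const rxFromAnswer = /\bhttps?:\/\/(?:(?:[-;:&=+$,\w]+@)?[A-Za-z0-9.-]+|(?:www\.|[-;:&=+$,\w]+@)[A-Za-z0-9.-]+)(?:\/[+~%\/.\w_-]*\??(?:[-+=&;%@.\w_]*)#?[.!\/\\\w]*)?/g;

const candidates = "www.example.com https://www.example.com http://user@example.com";

// The bare www.example.com is skipped because no protocol precedes it.
const found = candidates.match(rxFromAnswer);

console.log(found); // [ 'https://www.example.com', 'http://user@example.com' ]
```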

There are also many redundant capturing groups. Consider removing the ones you do not need, and converting the ones you only need for quantifying or grouping into non-capturing groups.
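The difference between capturing and non-capturing groups can be seen in the length of the match array. This is a sketch with a deliberately shortened pattern and a made-up URL, not the full regex:

```javascript
// Hypothetical input and simplified patterns, for illustration only.
const url = "https://example.com/path";

// Three capturing groups: each one adds a slot to the result array.
const capturing = /(https?):\/\/((www\.)?[\w.-]+)/.exec(url);

// Same match with no captures: (?:...) groups only, for the optional www.
const nonCapturing = /https?:\/\/(?:www\.)?[\w.-]+/.exec(url);

console.log(capturing.length);    // 4 (whole match plus three groups)
console.log(nonCapturing.length); // 1 (whole match only)
console.log(capturing[0] === nonCapturing[0]); // true
```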

let text = "https://example.com http://example.com/more?k=v ftp://not.this.one www.example www.example.com";
let rx = /\bhttps?:\/\/(?:(?:[-;:&=+$,\w]+@)?[A-Za-z0-9.-]+|(?:www\.|[-;:&=+$,\w]+@)[A-Za-z0-9.-]+)(?:\/[+~%\/.\w_-]*\??(?:[-+=&;%@.\w_]*)#?[.!\/\\\w]*)?/g;
let replaced = text.replace(rx,'<a href="$&" target="_blank">$&</a>'); 
console.log(replaced);

The $& placeholder in the replacement string refers to the whole match value, so there is no need to wrap the whole pattern in an additional pair of capturing parentheses.
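For comparison, here is a small sketch showing that $& and an outer capturing group with $1 give the same result. The input string and the shortened pattern are assumptions made for the demo.

```javascript
// Hypothetical input; the pattern is a simplified stand-in for the full regex.
const input = "visit https://example.com today";

// $& inserts the whole match, so no outer capturing group is needed.
const withWholeMatch = input.replace(/\bhttps?:\/\/[\w.-]+/g, '<a href="$&">$&</a>');

// Same output, but with an extra capturing group referenced as $1.
const withGroup = input.replace(/(\bhttps?:\/\/[\w.-]+)/g, '<a href="$1">$1</a>');

console.log(withWholeMatch); // visit <a href="https://example.com">https://example.com</a> today
console.log(withWholeMatch === withGroup); // true
```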

Upvotes: 5
