Reputation: 11
I'm using the following regex to pick up all http, https, or ftp URLs from within a large string:
/(\b(https?|ftp):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/gim;
I want to extend it so that it does NOT pick up any URLs that are immediately preceded by src=" (that is, URLs inside a src attribute).
Match:
https://xxx.yyy.com
No Match:
src="https://xxx.yyy.com
I've tried a negative lookbehind on src=" with no success.
Upvotes: 1
Views: 78
Reputation: 9650
Lookbehinds are not supported in JavaScript engines that predate ES2018. You can still solve this by explicitly matching src=" in an optional group and then filtering out every match in which that group participated:
var input = `Match: https://match.xxx.yyy.com
No Match: src="https://fail.xxx.yyy.com`;
var regex = /(src=")?\b(https?|ftp):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]/gim;
var urls = [];
// collect only matches without the `src="` prefix;
// replace() is used here only to iterate over the matches, its return value is ignored
input.replace(regex, function (match, src) {
  if (!src) { urls.push(match); }
});
console.log(urls);
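For engines that implement ES2018, a negative lookbehind can express the same filter directly. The following is only a sketch under that assumption: the name `urlPattern` is illustrative, and the pattern body is the one from the question with the capturing group made non-capturing.

```javascript
// Sample text: one plain URL and one URL that directly follows src="
const input = `Match: https://match.xxx.yyy.com
No Match: src="https://fail.xxx.yyy.com`;

// (?<!src=") rejects a URL that starts immediately after src="
// Assumption: the engine supports ES2018 lookbehind assertions.
const urlPattern = /(?<!src=")\b(?:https?|ftp):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]/gi;

// match() with the g flag returns all matches, or null when there are none
const urls = input.match(urlPattern) || [];
console.log(urls); // [ 'https://match.xxx.yyy.com' ]
```

On engines without lookbehind support this pattern is a syntax error, so the optional-group approach above remains the safer choice when older browsers matter.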
Upvotes: 0
Reputation: 1997
JavaScript regular expressions did not support lookbehinds before ES2018.
One common way you could match strings like this is:
[^"]https:\/\/[a-z.]+
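You should write a more detailed pattern for the domain, and then skip the first character of each match to get the URL. Note that this pattern requires one character before the URL, so it cannot match a URL at the very start of the string, and it rejects any URL preceded by a double quote, not only those after src=".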
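A minimal sketch of this idea, with two assumptions that are not in the pattern above: the prefix is written as `(^|[^"])` so that a URL at the very start of the string is still found, and the URL is wrapped in its own capturing group so it can be read directly instead of slicing off the first character. The deliberately simple `[a-z.]+` domain part is kept as is.

```javascript
const input = `https://start.xxx.yyy.com
Match: https://match.xxx.yyy.com
No Match: src="https://fail.xxx.yyy.com`;

// (^|[^"]) consumes the start of the string or one non-quote character,
// so a URL that directly follows a double quote is never matched.
const pattern = /(^|[^"])(https:\/\/[a-z.]+)/g;

const urls = [];
for (const m of input.matchAll(pattern)) {
  urls.push(m[2]); // group 2 is the URL without the consumed prefix character
}
console.log(urls); // [ 'https://start.xxx.yyy.com', 'https://match.xxx.yyy.com' ]
```

Because the check only looks at a single preceding character, a quoted URL in any other attribute (for example href) is skipped as well.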
Upvotes: 1