Reputation: 101
I need a regex to scan JS files for any image paths they contain.
These paths would generally be nested as follows:
$img1 = "foo/bar.png";
$img2 = 'foo/bar.jpg';
$img3 = "{'myimg':'foo/bar.png'}";
I need a regex which will be able to pick up the whole image path inside the quotes, but sometimes nested inside a json string, or otherwise encoded... essentially, matching a whole image path by detecting just the existence of the extension (jpg|png|gif).
I have found a regex that works well in PHP, but I need one that works in JavaScript.
$pattern = '~/?+(?>[^"\'/]++/)+[^"\'\s]+?\.(?>(?>pn|jpe?)g|gif)\b~';
What would the equivalent pattern look like in JavaScript?
Upvotes: 1
Views: 855
Reputation: 163577
Possessive quantifiers (++) and atomic groups (?>...) are not supported in JavaScript.
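As a quick way to see this, here is a minimal sketch that tries to compile both pattern sources with the RegExp constructor. The helper name compiles is made up for illustration, and the result assumes an engine without possessive quantifiers or atomic groups, which is the case for current JavaScript engines as far as I know.

```javascript
// Hypothetical helper: true if this engine accepts the pattern source.
function compiles(source) {
  try {
    new RegExp(source);
    return true;
  } catch (err) {
    return false; // a SyntaxError is thrown for unsupported syntax
  }
}

// The PHP pattern from the question, without the ~ delimiters.
const phpPattern = String.raw`/?+(?>[^"'/]++/)+[^"'\s]+?\.(?>(?>pn|jpe?)g|gif)\b`;
// Same pattern with ++ replaced by + and (?> replaced by (?:
const jsPattern = String.raw`\/?(?:[^"'/]+\/)+[^"'\s]+?\.(?:(?:pn|jpe?)g|gif)\b`;

console.log(compiles(phpPattern)); // false
console.log(compiles(jsPattern));  // true
```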
The updated pattern could look like this:
\/?(?:[^"'/]+\/)+[^"'\s]+?\.(?:(?:pn|jpe?)g|gif)\b
But if you only need those matches, and a // in the path is also acceptable, you can simply exclude the quotes using only a negated character class, [^"']+
Note that the / is escaped as \/ because the regex delimiters in JavaScript are /, and that you don't have to escape the ' and " inside the character class.
The shorter version could look like
[^"']+\.(?:(?:pn|jpe?)g|gif)\b
Explanation:
[^"']+ - Match any char except ' or " 1+ times
\. - Match a dot
(?: - Non capture group
(?:pn|jpe?)g - Match either png, jpg or jpeg
| - Or
gif - Match gif literally
) - Close non capture group
\b - A word boundary
const regex = /[^"']+\.(?:(?:pn|jpe?)g|gif)\b/;
[
"foo/bar.png",
"foo/bar.jpg",
"{'myimg':'foo/bar.png'}"
].forEach(s => console.log(s.match(regex)[0]));
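To make the difference between the two patterns above concrete, here is a small sketch comparing them on a path with and without a directory part. The names strict, relaxed and firstMatch are only illustrative.

```javascript
// Direct conversion of the PHP pattern: needs at least one "dir/" segment.
const strict = /\/?(?:[^"'/]+\/)+[^"'\s]+?\.(?:(?:pn|jpe?)g|gif)\b/;
// Shorter version: anything between quotes that ends in an image extension.
const relaxed = /[^"']+\.(?:(?:pn|jpe?)g|gif)\b/;

// Hypothetical helper: first match as a string, or null if there is none.
function firstMatch(regex, text) {
  const m = text.match(regex);
  return m ? m[0] : null;
}

console.log(firstMatch(strict, '"foo/bar.png"'));  // foo/bar.png
console.log(firstMatch(relaxed, '"foo/bar.png"')); // foo/bar.png
console.log(firstMatch(strict, "'bar.jpg'"));      // null (no directory segment)
console.log(firstMatch(relaxed, "'bar.jpg'"));     // bar.jpg
```

Note that the helper guards against null, since match returns null when nothing is found and indexing [0] would then throw.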
Upvotes: 2
Reputation: 18641
I'd use
string.match(/[^"'<>]+\.(?:png|jpe?g|gif)\b/gi)
Note: the g flag returns all occurrences, the i flag makes the match case insensitive, and <> is added to the character class so that a match stops at a tag boundary.
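A small sketch of what excluding < and > changes, using a made-up HTML fragment as input. The names withoutTags, withTags and firstImagePath are only for illustration.

```javascript
const withoutTags = /[^"']+\.(?:png|jpe?g|gif)\b/i;
const withTags = /[^"'<>]+\.(?:png|jpe?g|gif)\b/i;

// Hypothetical helper: first match as a string, or null if there is none.
function firstImagePath(regex, text) {
  const m = text.match(regex);
  return m ? m[0] : null;
}

const html = "<div>foo/bar.png</div>";

console.log(firstImagePath(withoutTags, html)); // <div>foo/bar.png
console.log(firstImagePath(withTags, html));    // foo/bar.png
```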
Explanation
--------------------------------------------------------------------------------
  [^"'<>]+                 any character except: '"', ''', '<', '>'
                           (1 or more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  \.                       '.'
--------------------------------------------------------------------------------
  (?:                      group, but do not capture:
--------------------------------------------------------------------------------
    png                      'png'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    jp                       'jp'
--------------------------------------------------------------------------------
    e?                       'e' (optional (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    g                        'g'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    gif                      'gif'
--------------------------------------------------------------------------------
  )                        end of grouping
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
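Putting it together, a minimal sketch of scanning a whole source text, assuming the file content has already been read into a string. The function name findImagePaths and the sample text (built from the lines in the question) are just for illustration.

```javascript
const imagePathRegex = /[^"'<>]+\.(?:png|jpe?g|gif)\b/gi;

// Hypothetical helper: all image paths in the text, or an empty array.
function findImagePaths(source) {
  // With the g flag, match returns every match, or null if there is none.
  return source.match(imagePathRegex) ?? [];
}

const source = [
  '$img1 = "foo/bar.png";',
  "$img2 = 'foo/bar.JPG';",
  '$img3 = "{\'myimg\':\'foo/bar.png\'}";',
].join("\n");

console.log(findImagePaths(source));
// [ 'foo/bar.png', 'foo/bar.JPG', 'foo/bar.png' ]
console.log(findImagePaths("no images here")); // []
```

If duplicates are not wanted, the result could be wrapped in a Set.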
Upvotes: 2