Reputation: 7035
It seems that JS is only replacing the last match of a regex string. It is most likely something wrong with my regex.
var data = "<div class=\"separator\" style=\"clear: both; text-align: center;\"><img border=\"0\" src=\"https://lh3.googleusercontent.com/-dJnr7dsB21Y/UEhijTGNweI/AAAAAAAAA_w/DWiWYDOHXBA/s640/blogger-image-438529929.jpg\" /></div><div class=\"separator\" style=\"clear: both; text-align: center;\"><img border=\"0\" src=\"https://lh4.googleusercontent.com/-B3wU95K7DLI/UEhiuu75V-I/AAAAAAAAA_4/lxzGd2WajNE/s640/blogger-image-1479348845.jpg\" /></div>";
var regex = /<img (.*)src=\"(.*)\" (.*)\/>/g;
console.log("Before: "+data);
console.log("After: "+data.replace(regex, "[$2]"));
It outputs the following:
Before: <normal before string>
After: <div class="separator" style="clear: both; text-align: center;">[https://lh4.googleusercontent.com/-B3wU95K7DLI/UEhiuu75V-I/AAAAAAAAA_4/lxzGd2WajNE/s640/blogger-image-1479348845.jpg]</div>
(only returning the last image)
This is node.js, by the way.
Upvotes: 2
Views: 622
Reputation: 5117
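It's because you are using greedy matching: each (.*) grabs as much as it can, so the match runs past the end of the first tag and carries on to the last one. The whole span from the first <img to the last /> becomes a single match, which is why only the last URL ends up in $2.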
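To see the greedy behaviour in isolation, here is a small sketch. The input is a shortened, made-up stand-in for the string in the question (the `a.jpg` and `b.jpg` URLs and the variable names are placeholders, not from the original post):

```javascript
// Hypothetical shortened input: two <img> tags with placeholder URLs.
var sample = '<div><img border="0" src="a.jpg" /></div>' +
             '<div><img border="0" src="b.jpg" /></div>';

// Same pattern as in the question (the \" escapes are not needed in a regex literal).
var greedy = /<img (.*)src="(.*)" (.*)\/>/g;

// With the g flag, match() returns every full match as an array.
var greedyMatches = sample.match(greedy);

console.log(greedyMatches.length); // 1: both tags are swallowed by a single match
console.log(greedyMatches[0]);     // runs from the first <img to the last />
console.log(sample.replace(greedy, "[$2]")); // <div>[b.jpg]</div>
```

So the replace is not skipping the first image. It finds only one match, and that match covers both images.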
Try this instead:
var regex = /<img ([^>]*)src="([^"]*)" ([^>]*)\/>/g;
Instead of (.*), I used ([^>]*) and ([^"]*). These are negated character classes: a negated character class matches any character except the ones listed in it. So [^>]* cannot run past the end of a tag, and [^"]* cannot run past the closing quote of the src attribute.
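A quick check of the negated-class pattern, again on a shortened, made-up input (the `a.jpg` and `b.jpg` URLs and the variable names are placeholders, not from the question):

```javascript
// Hypothetical shortened input: two <img> tags with placeholder URLs.
var html = '<div><img border="0" src="a.jpg" /></div>' +
           '<div><img border="0" src="b.jpg" /></div>';

// [^>]* stops at the end of the tag, [^"]* stops at the closing quote.
var fixed = /<img ([^>]*)src="([^"]*)" ([^>]*)\/>/g;

// Each <img> tag is now its own match, so each one is replaced separately.
var result = html.replace(fixed, "[$2]");

console.log(result); // <div>[a.jpg]</div><div>[b.jpg]</div>
```

Note that this pattern still assumes the tag layout from the question: a space after `<img`, a double-quoted src followed by a space, and a self-closing `/>`. Tags written differently would need a looser pattern.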
EDIT: removed escaping the double quotes, per Felix's comment, thx for the catch!
Upvotes: 4