Reputation: 407
I am trying to extract the src of an img tag from a long HTML string.
I know there are many questions about how to do this, but my attempts give the wrong result. My question is only about the contradictory results I am seeing.
I am using:
var url = "<img height=\"100\" src=\"data:image/png;base64,testurlhere\" width=\"200\"></img>";
var regexp = /<img[^>]+src\s*=\s*['"]([^'"]+)['"][^>]*>/g;
var src = url.match(regexp);
But this does not extract the src properly. I keep getting src = <img height="100" src="data:image/png;base64,testurlhere" width="200"></img>
instead of data:image/png;base64,testurlhere
However, when I try the same pattern in the regex tester at regex101, it extracts the src correctly. What am I doing wrong? Is match()
the wrong function to use?
Upvotes: 7
Views: 16748
Reputation: 19
const src = url.slice(url.indexOf("src")).split('"')[1]
Regex gives me headaches. Boohoo.
Find the index of src in the HTML string (the variable named url in the question), slice the string from that index, and then split the result on the " characters. The second item in the resulting array is your src link.
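The steps above can be spelled out one at a time. This is only a sketch: the helper name srcBySlicing and the null return for a missing src are assumptions made for illustration, and it assumes the attribute is double-quoted and that the first "src" in the string is the attribute you want.

```javascript
// Hypothetical helper; the name and the null return are assumptions for illustration.
function srcBySlicing(html) {
  // Position of the first "src" in the string, or -1 if there is none.
  const start = html.indexOf("src");
  if (start === -1) return null;

  // Everything from "src" onward, e.g. 'src="data:..." width="200"></img>'.
  const tail = html.slice(start);

  // Splitting on double quotes puts the attribute value at index 1:
  // ['src=', 'data:image/png;base64,testurlhere', ' width=', '200', '></img>']
  const parts = tail.split('"');
  return parts.length > 1 ? parts[1] : null;
}

const sample = "<img height=\"100\" src=\"data:image/png;base64,testurlhere\" width=\"200\"></img>";
console.log(srcBySlicing(sample));
```

Single-quoted attributes, or the text "src" appearing earlier in the string, would break this, so it is best kept for input you control.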
Upvotes: 1
Reputation: 345
If you need to get the whole img tags for some reason:
const imgTags = html.match(/<img [^>]*src="[^"]*"[^>]*>/gm);
Then you can extract the source link from every img tag in the array like this:
const sources = html.match(/<img [^>]*src="[^"]*"[^>]*>/gm)
.map(x => x.replace(/.*src="([^"]*)".*/, '$1'));
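As an alternative sketch, the capture group from the question's own pattern can be read directly with String.prototype.matchAll, which avoids the second replace step. The helper name extractImgSources is hypothetical, and matchAll needs an ES2020 runtime.

```javascript
// Hypothetical helper name; not part of the question or of any library.
function extractImgSources(html) {
  // Same pattern as in the question. matchAll requires the g flag.
  const imgSrcPattern = /<img[^>]+src\s*=\s*['"]([^'"]+)['"][^>]*>/g;

  // Each result is an array: index 0 is the whole tag, index 1 is capture group 1.
  return Array.from(html.matchAll(imgSrcPattern), (m) => m[1]);
}

const url = "<img height=\"100\" src=\"data:image/png;base64,testurlhere\" width=\"200\"></img>";
console.log(extractImgSources(url)[0]);
```

This also shows why the call in the question returns the tag: with the g flag, match() returns only the full matches and drops the capture groups, whereas matchAll (or exec in a loop) keeps them.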
Upvotes: 23
Reputation: 388316
I'm not a big fan of using regex to parse HTML content, so here goes the longer way:
var url = "<img height=\"100\" src=\"data:image/png;base64,testurlhere\" width=\"200\"></img>";
var tmp = document.createElement('div');
tmp.innerHTML = url;
var src = tmp.querySelector('img').getAttribute('src');
console.log(src);
Upvotes: 5