Reputation: 2124
I have some test data in the following format:
"lorem ipsum <img src='some_url' class='some_class' /> lorem ipsum <img src='some_url' class='some_class' /> ipsum <img src='some_url' class='some_class' />"
My goal is to identify all the image tags, along with their source URLs and CSS classes, and store them together with the remaining text in an ordered array like this:
["lorem ipsum", {imageObject1}, "lorem ipsum", {imageObject2}, "ipsum", {imageObject3}]
For this I tried to create a sample regex:
var regex = /(.*(<img\s+src=['"](.+)['"]\s+(class=['"].+['"])?\s+\/>)+?.*)+/ig
When I try this regex with the sample text, I get:
regex.exec(sample_text) => [0:"lorem ipsum <img src='some_url1' class='some_class1' /> lorem ipsum <img src='some_url2' class='some_class2' /> ipsum <img src='some_url3' class='some_class3' />"
1:"lorem ipsum <img src='some_url1' class='some_class1' /> lorem ipsum <img src='some_url2' class='some_class2' /> ipsum <img src='some_url3' class='some_class3' />"
2:"<img src='some_url3' class='some_class3' />"
3:"some_url3"
4:"class='some_class3'"]
How can I transform the sample HTML text in JavaScript into an array of tagged HTML objects with their attributes?
Upvotes: 0
Views: 36
Reputation: 6561
Do not use regular expressions to parse HTML. Use a DOMParser to parse the string and then CSS queries to get the images from the DOM. It will be much more reliable and easier to read.
var html = "lorem ipsum <img src='some_url' class='some_class' /> lorem ipsum <img src='some_url' class='some_class' /> ipsum <img src='some_url' class='some_class' />"
var nodes = new DOMParser().parseFromString(html, "text/html").body.childNodes
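That will get you almost what you wanted: a NodeList of Text and img nodes in document order. The Text nodes keep their surrounding whitespace, so trim them and filter out any that end up empty.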
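To get from that NodeList to the array shape in the question, something like the following could work. It is only a sketch: `toOrderedArray` and the `{src, className}` object shape are names I made up for illustration, and it assumes the HTML is flat (only text and img directly under body). The helper relies only on `nodeType`, `nodeValue`, `nodeName` and `getAttribute`, so it does not care where the nodes come from.

```javascript
// Hypothetical helper: turns a flat list of DOM nodes into the ordered
// array from the question. nodeType 3 is a text node, 1 is an element.
function toOrderedArray(nodes) {
  var result = [];
  Array.from(nodes).forEach(function (node) {
    if (node.nodeType === 3) {
      var text = node.nodeValue.trim();
      if (text) result.push(text); // skip whitespace-only text nodes
    } else if (node.nodeType === 1 && node.nodeName.toLowerCase() === "img") {
      // Assumed object shape; pick whatever fields you need.
      result.push({
        src: node.getAttribute("src"),
        className: node.getAttribute("class")
      });
    }
  });
  return result;
}

// Browser-only usage: DOMParser is not defined in plain Node.js.
if (typeof DOMParser !== "undefined") {
  var sample = "lorem ipsum <img src='some_url' class='some_class' /> ipsum";
  var parsed = new DOMParser().parseFromString(sample, "text/html");
  console.log(toOrderedArray(parsed.body.childNodes));
}
```

Any element other than img is silently ignored here, which is fine for the sample text but not for nested markup.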
Or, in case the HTML contains more than just images and text, query for the images directly:
var images = new DOMParser().parseFromString(html, "text/html").querySelectorAll("img")
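// Pair each image with the text node right before it (a Map keyed by that text would drop images that follow identical text)
var pairs = [...images].map(img => [img.previousSibling ? img.previousSibling.nodeValue : null, img])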
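If the images can sit inside other elements (a p or div, say), looking only at the previous sibling may miss text. A recursive walk that keeps document order is one possible alternative. Again this is just a sketch under assumptions: `collectInOrder` and the `{src, className}` shape are hypothetical names, not part of any API.

```javascript
// Hypothetical helper: depth-first walk that collects trimmed text and
// image descriptors in document order, descending into other elements.
function collectInOrder(node, out) {
  out = out || [];
  Array.from(node.childNodes || []).forEach(function (child) {
    if (child.nodeType === 3) {                      // text node
      var text = child.nodeValue.trim();
      if (text) out.push(text);
    } else if (child.nodeType === 1) {               // element node
      if (child.nodeName.toLowerCase() === "img") {
        out.push({
          src: child.getAttribute("src"),
          className: child.getAttribute("class")
        });
      } else {
        collectInOrder(child, out);                  // e.g. <p>, <div>, <span>
      }
    }
  });
  return out;
}

// Browser-only usage: DOMParser is not defined in plain Node.js.
if (typeof DOMParser !== "undefined") {
  var nested = "<p>lorem <img src='some_url' class='some_class' /></p> ipsum";
  var doc = new DOMParser().parseFromString(nested, "text/html");
  console.log(collectInOrder(doc.body));
}
```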
Upvotes: 1