now_world

Reputation: 1096

How do I split a string of HTML into an array of words and tags

How do I split a string of HTML into an array so that each word is an item in the array including the tags that surround it?

//So this string:
var myHTMLString = "Something, something <span @click='changeSelected(0)' id='0' class='wrong'>else</span> is foo <span @click='changeSelected(0)' id='0' class='wrong'>hello world</span> to all.";

//Would become this:
var HTMLAry = ["Something,", "something", "<span @click='changeSelected(0)' id='0' class='wrong'>else</span>", "is", "foo", "<span @click='changeSelected(0)' id='0' class='wrong'>hello world</span>", "to", "all."];

Things we can rely on:

How can I achieve this?

The only approach I can think of is some sort of regex. However, answers to similar questions say that in most cases you should stay away from regex when working with HTML tags. Still, regex is the only thing I can imagine that would work.

var myHTMLString = "Something, something <span @click='changeSelected(0)' id='0' class='wrong'>else</span> is foo <span @click='changeSelected(0)' id='0' class='wrong'>hello world</span> to all.";

//This^ would become this:

var HTMLAry = ["Something,", "something", "<span @click='changeSelected(0)' id='0' class='wrong'>else</span>", "is", "foo", "<span @click='changeSelected(0)' id='0' class='wrong'>hello world</span>", "to", "all."];
    
console.log(myHTMLString.match(/<span.*?>.*?<\/span\>/g));

Upvotes: 0

Views: 833

Answers (1)

dave

Reputation: 64657

Create an element, set its innerHTML to your string, and get its child nodes. For text nodes, split the text on spaces and filter out the empty strings; for all other nodes, take the outerHTML. Then flatten the resulting array.

var myHTMLString = "Something, something <span @click='changeSelected(0)' id='0' class='wrong'>else</span> is foo <span @click='changeSelected(0)' id='0' class='wrong'>hello world</span> to all.";

var el = document.createElement('div');

el.innerHTML = myHTMLString;

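// Element nodes contribute their outerHTML; text nodes (whose outerHTML is undefined)
// are split on spaces, with empty strings filtered out.
var arr = Array.from(el.childNodes).map(e => e.outerHTML || e.nodeValue.split(' ').filter(t => t));

// Flatten the mix of strings and nested arrays into a single array.
console.log([].concat.apply([], arr));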
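
Where no DOM is available (for example in Node.js), the regex route mentioned in the question can be sketched roughly as follows. This is only a minimal sketch and not a general HTML parser: it assumes tags of the same name are never nested, that attribute values contain no `>` character, and that there are no void or self-closing tags. The function name `splitHtmlIntoTokens` is made up for illustration.

```javascript
// Hypothetical helper, not part of any library.
// Assumptions: no nested same-name tags, no ">" inside attribute values.
function splitHtmlIntoTokens(html) {
  // Alternative 1: an opening tag, its content (matched lazily), and the
  //                closing tag with the same name (backreference \1).
  // Alternative 2: a run of characters that are neither whitespace nor "<",
  //                i.e. a plain word with its punctuation.
  var tokenPattern = /<(\w+)\b[^>]*>[\s\S]*?<\/\1>|[^\s<]+/g;
  // match() returns null when nothing matches, so fall back to an empty array.
  return html.match(tokenPattern) || [];
}

var myHTMLString = "Something, something <span @click='changeSelected(0)' id='0' class='wrong'>else</span> is foo <span @click='changeSelected(0)' id='0' class='wrong'>hello world</span> to all.";

console.log(splitHtmlIntoTokens(myHTMLString));
```

For the example string this should give the same eight items as the DOM approach. For arbitrary or nested markup, parsing with a real DOM as above is the safer choice.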

Upvotes: 2
