Reputation: 117
I have this string:
<p><ins>Article </ins>Title</p>
<p>Here's some sample text</p>
I'd like to get the words into an array, ignoring the HTML tags, i.e.
["Article", "Title", "Here's", "some", "sample", "text"]
I tried to write a regex, but it doesn't work. Thanks in advance.
Upvotes: 0
Views: 579
Reputation: 20885
You don't need a regex for this; you can use the browser's DOM API instead:
const html = "<p><ins>Article </ins>Title</p> <p>Here's some sample text</p>";
const div = document.createElement("div");
div.innerHTML = html;
// This will extract the text (remove the HTML tags)
const text = div.textContent || div.innerText || "";
console.log(text);
// Then you can simply split the string
const result = text.split(' ');
console.log(result);
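One caveat with `split(' ')`: it only splits on single space characters, so a newline between the paragraphs is not treated as a separator, and leading, trailing, or repeated spaces produce empty strings in the result. A minimal sketch of a more tolerant split follows; `splitWords` is a hypothetical helper name, not part of any API, and it works on the extracted text, not on the HTML itself.

```javascript
// Hypothetical helper: return every run of non-whitespace characters.
// String.prototype.match returns null when nothing matches, hence the fallback.
function splitWords(text) {
  return text.match(/\S+/g) || [];
}

// Text as it might come out of div.textContent, with stray whitespace
console.log(splitWords("  Article Title\nHere's some sample text "));
// → ["Article", "Title", "Here's", "some", "sample", "text"]
```

Matching the words instead of splitting on separators means an empty or whitespace-only string yields an empty array.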
Upvotes: 3
Reputation: 68433
Put the string in a dummy div and read its innerText:
var str = `<p><ins>Article </ins>Title</p>
<p>Here's some sample text</p>`;
var div = document.createElement( "div" );
div.innerHTML = str; //assign str as innerHTML
var text = div.innerText; //get text only
var output = text.split( /\s+/ ); //split on runs of whitespace, including line feeds
console.log( output );
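Note that this relies on there being whitespace between the two paragraphs. As far as I know, for a div that is not attached to the page, `innerText` falls back to the same value as `textContent`, which concatenates text nodes with no separator, so `<p>a</p><p>b</p>` would come out as `ab`. A rough sketch of one way around that is to walk the text nodes yourself and join them with a space; `collectTextParts` and `wordsFromNode` are made-up names for illustration, and the code only assumes the standard `nodeType`, `nodeValue` and `childNodes` properties.

```javascript
// Same numeric value as Node.TEXT_NODE in browsers
const TEXT_NODE = 3;

// Hypothetical helper: gather the value of every text node under `node`
function collectTextParts(node, parts = []) {
  if (node.nodeType === TEXT_NODE) {
    parts.push(node.nodeValue);
  } else {
    for (const child of Array.from(node.childNodes || [])) {
      collectTextParts(child, parts);
    }
  }
  return parts;
}

// Hypothetical helper: join the parts with a space so that text from
// adjacent elements never fuses, then pick out the words
function wordsFromNode(root) {
  return collectTextParts(root).join(" ").match(/\S+/g) || [];
}
```

In the browser you would call `wordsFromNode(div)` after setting `div.innerHTML`. The trade-off is that a word split across inline tags, such as `<b>A</b>rticle`, would be returned as two words.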
Upvotes: 5