George Livanoss

Reputation: 599

JavaScript: parse an HTML string and recognize its parts

I have a string like this:

"this is a string with <em>html</em> values"

and I need to identify the em part as well as the text before and after the tags.

Can I use JavaScript to split the string into an array, for example:

["this is a string with", "html", "values"]

Upvotes: 1

Views: 458

Answers (2)

Shersher

Reputation: 87

The other answer is useful if you want to navigate between HTML nodes, but if you would rather manipulate the string directly, you can use regular expressions. See for instance https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions for an introduction.

// Input string
const str = 'this is a string with <em>html</em> values';

// Capture the text before, inside, and after the <em> element.
// The lazy quantifiers stop at the first <em> and the first </em>.
const regExp = /(.*?)<em>(.*?)<\/em>(.*)/;

// match() returns null when the string does not fit the pattern
const match = str.match(regExp);
if (match) {
  // Skip the full match at index 0 and trim the spaces around each captured part
  const output = match.slice(1).map(part => part.trim());

  // Logs ["this is a string with", "html", "values"]
  console.log(output);
}
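
The pattern above handles exactly one <em> element. If the string may contain several, splitting on the tags themselves is one alternative. This is only a minimal sketch: `splitOnTag` is a hypothetical helper name, it assumes `tagName` is a plain tag name such as "em", and it assumes the tags are not nested. For arbitrary HTML a real parser is the safer choice.

```javascript
// Hypothetical helper: split a string on the opening and closing tags of one element.
function splitOnTag(str, tagName) {
  // Matches <em>, <em class="x"> and </em>, but not <embed> (assumes a plain tag name)
  const tagPattern = new RegExp(`</?${tagName}(?:\\s[^>]*)?>`, 'i');
  return str
    .split(tagPattern)                 // cut the string at every matching tag
    .map(part => part.trim())          // remove the spaces left next to the tags
    .filter(part => part.length > 0);  // drop empty pieces, e.g. when a tag starts the string
}

const parts = splitOnTag('this is a string with <em>html</em> values', 'em');
console.log(parts); // ["this is a string with", "html", "values"]
```

Note that the result no longer tells you which pieces were inside the tag, only the order in which they appeared.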

Upvotes: 1

ROOT

Reputation: 11622

You can use DOMParser to parse the string into an HTML document. Create an instance, pass the string to parseFromString(), then iterate over the childNodes of the generated document's body using .forEach(). Text nodes have the nodeName "#text", so checking for that name distinguishes plain text from actual HTML elements:


let domparser = new DOMParser();
let doc = domparser.parseFromString('this is a string with <em>html</em> values', 'text/html');

doc.body.childNodes.forEach(function(node) {
  if (node.nodeName === "#text") {
    console.log(node.nodeValue);
  } else {
    console.log(node);
  }
});
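
To collect the array from the question instead of logging each node, the child nodes can be mapped to their text content. This is a minimal sketch: `nodesToParts` is a hypothetical helper name, and the DOMParser part assumes a browser environment, so it is guarded and skipped elsewhere.

```javascript
// Hypothetical helper: turn a list of DOM nodes into an array of trimmed text parts.
// Works on a NodeList or any array-like of objects that have a textContent property.
function nodesToParts(nodes) {
  return Array.from(nodes, node => node.textContent.trim())
    .filter(text => text.length > 0); // skip whitespace-only nodes
}

// DOMParser exists in browsers; the guard keeps the sketch from throwing elsewhere.
if (typeof DOMParser !== 'undefined') {
  const doc = new DOMParser().parseFromString(
    'this is a string with <em>html</em> values',
    'text/html'
  );
  console.log(nodesToParts(doc.body.childNodes));
}
```

Because textContent is defined on both text nodes and elements, the same call covers the plain text and the content of the em element.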

Upvotes: 4
