user2002495

Reputation: 2146

JavaScript: replace a tag but preserve its content

Say I have text like this:

This should also be extracted, <strong>text</strong>

I need only the text from the entire string. I have tried this:

r = r.replace(/<strong[\s\S]*?>[\s\S]*?<\/strong>/g, "$1");

but it failed (strong is still there). Is there a proper way to do this?

Expected Result

This should also be extracted, text

Solution:

To target a specific tag, I used this:

r = r.replace(/<strong\b[^>]*>([^<>]*)<\/strong>/i, "**$1**")
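A note on why the attempt in the question did not give the expected result: `$1` refers to the first capturing group, and that pattern has no parentheses, so there is nothing for it to refer to. Below is a minimal sketch of the difference, assuming the goal is the Expected Result above. The variable names are illustrative and not from the post.

```javascript
const input = "This should also be extracted, <strong>text</strong>";

// No parentheses in the pattern, so there is no group 1;
// JavaScript then leaves "$1" in the output as literal text.
const withoutGroup = input.replace(/<strong[\s\S]*?>[\s\S]*?<\/strong>/g, "$1");

// Wrapping the inner part in (...) makes it group 1, which "$1" inserts.
const withGroup = input.replace(/<strong\b[^>]*>([\s\S]*?)<\/strong>/gi, "$1");

console.log(withoutGroup);
console.log(withGroup);
```

The second call keeps the tag's content and drops only the surrounding tags, which is what the question asks for.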

Upvotes: 0

Views: 308

Answers (2)

T.J. Crowder

Reputation: 1074555

To parse HTML, you need an HTML parser. See this answer for why.

If you just want to remove <strong> and </strong> from the text, you don't need parsing, but of course simplistic solutions tend to fail, which is why you need an HTML parser to parse HTML. Here's a simplistic solution that removes <strong> and </strong>:

str = str.replace(/<\/?strong>/g, "");

var yourString = "This should also be extracted, <strong>text</strong>";
yourString = yourString.replace(/<\/?strong>/g, "");
display(yourString);

function display(msg) {
  // Show a message, making sure any HTML tags show
  // as text
  var p = document.createElement('p');
  p.innerHTML = msg.replace(/&/g, "&amp;").replace(/</g, "&lt;");
  document.body.appendChild(p);
}
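To make the "simplistic solutions tend to fail" point concrete, here is a small sketch comparing the pattern above with a slightly more tolerant variant. The variant and the sample string are illustrations added here, not part of the original answer. The variant is still not a parser: an attribute value containing > would break it.

```javascript
const sample = 'Some <strong class="x">bold</strong> and <STRONG>loud</STRONG> text';

// The simple pattern only matches the exact lowercase tags with no attributes,
// so here it removes just the one lowercase </strong>.
const simple = sample.replace(/<\/?strong>/g, "");

// Hypothetical variant: \b stops it matching e.g. <stronger>, [^>]* skips
// attributes, and the i flag accepts upper-case tags.
const tolerant = sample.replace(/<\/?strong\b[^>]*>/gi, "");

console.log(simple);
console.log(tolerant);
```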

Back to parsing: In your case, you can easily do it with the browser's parser, if you're on a browser:

var yourString = "This should also be extracted, <strong>text</strong>";
var div = document.createElement('div');
div.innerHTML = yourString;
display(div.innerText || div.textContent);

function display(msg) {
  // Show a message, making sure any HTML tags show
  // as text
  var p = document.createElement('p');
  p.innerHTML = msg.replace(/&/g, "&amp;").replace(/</g, "&lt;");
  document.body.appendChild(p);
}

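Most browsers provide innerText; older versions of Firefox provided only textContent (and old Internet Explorer provided only innerText), which is why there's that || there. Current browsers support both.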

In a non-browser environment, you'll want some kind of DOM library (there are lots of them).

Upvotes: 3

Amit Joki

Reputation: 59252

You can do this:

var r = "This should also be extracted, <strong>text</strong>";
r = r.replace(/<(.+?)>([^<]+)<\/\1>/,"$2");
console.log(r);

That is a fairly strict regex. If you want a more relaxed version, you can do:

r = r.replace(/<.+?>/g,"");
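As a rough comparison of the two patterns, the sketch below runs them on inputs that are not in the original post (the sample strings, the variable names, and the g-flag variant of the strict pattern are all assumptions added for illustration). It suggests that the strict pattern without g only unwraps the first matching pair, and that the relaxed pattern removes anything between < and >, including plain text that merely contains those characters.

```javascript
const strict = /<(.+?)>([^<]+)<\/\1>/;      // strict pattern from the answer
const strictAll = /<(.+?)>([^<]+)<\/\1>/g;  // same pattern with the g flag added
const relaxed = /<.+?>/g;                   // relaxed pattern from the answer

const mixed = "<em>a</em> and <strong>b</strong>";

// Without g, only the first tag pair is unwrapped.
const firstOnly = mixed.replace(strict, "$2");

// With g, every simple (non-nested, attribute-free) pair is unwrapped.
const allPairs = mixed.replace(strictAll, "$2");

// The relaxed pattern strips every tag-like run.
const stripped = mixed.replace(relaxed, "");

// It also eats ordinary text that happens to sit between < and >.
const mangled = "1 < 2 and 3 > 2".replace(relaxed, "");

console.log(firstOnly);
console.log(allPairs);
console.log(stripped);
console.log(mangled);
```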

Upvotes: 2
