Peter Krauss

Reputation: 13920

Allowing new line characters in javascript str.replace

This question is similar to "Allowing new line characters in javascript regex", but the /m solution given there does not work with str.replace. You can test the code below at this page

 <p id="demo"><i>I need to TRIM the italics here, 

  despite this line.</i>
 </p>

 <button onclick="myFunction()">Try it</button>

 <script>
 function myFunction()
 {
 var str=document.getElementById("demo").innerHTML; 
 var n=str.replace(/^(\s*)<i>(.+)<\/i>(\s*)$/m,"$1$2$3"); //tested also /s
 alert(str)
 document.getElementById("demo").innerHTML=n;
 }
 </script>

Upvotes: 0

Views: 134

Answers (2)

Casimir et Hippolyte

Reputation: 89547

One way to avoid problems with newlines is to not use the dot at all, for example:

var n=str.replace(/<i>([^<]+)<\/i>/,"$1");

I have replaced the dot with [^<] (anything that is not a <, which includes newlines).

The m modifier is not needed here, and you don't need to capture the whitespace either.

Note that my solution supposes that there is no < between <i> and </i>.
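To make the difference concrete, here is a minimal sketch comparing the dot with the negated class. The sample string and the variable names are made up for illustration; they are not taken from the question's page.

```javascript
// Hypothetical sample: an <i> element whose text spans several lines.
const sample1 = "<p><i>first line,\n\n second line.</i></p>";

// The dot does not match line breaks, so this pattern never matches
// and replace() returns the string unchanged.
const withDot = sample1.replace(/<i>(.+)<\/i>/, "$1");

// [^<] matches any character except "<", line breaks included.
const withClass = sample1.replace(/<i>([^<]+)<\/i>/, "$1");

console.log(withDot === sample1); // true
console.log(JSON.stringify(withClass));
```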

Otherwise, when you have nested tags for example, you can use this trick to avoid a lazy quantifier:

var n=str.replace(/<i>((?:[^<]+|<+(?!\/i>))+)<\/i>/,"$1");
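As a rough check of the idea, the sketch below runs the alternation on a string with a nested tag. It writes the non-capturing group so that it closes right after the lookahead, which is my reading of the intended grouping; the sample input and names are hypothetical.

```javascript
// Hypothetical sample: a <b> element nested inside <i>, plus a line break.
const sample2 = "<i>some <b>bold</b>\n text</i> tail";

// Either a run of non-"<" characters, or one or more "<" that do not
// start the closing </i> tag.
const nestedPattern = /<i>((?:[^<]+|<+(?!\/i>))+)<\/i>/;

// The simpler pattern gives up at the first "<" inside the element.
const simpleResult = sample2.replace(/<i>([^<]+)<\/i>/, "$1");
const nestedResult = sample2.replace(nestedPattern, "$1");

console.log(simpleResult === sample2); // true: no match
console.log(JSON.stringify(nestedResult));
```

This still assumes there is no <i> nested inside another <i>.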

Upvotes: 0

Martin Ender

Reputation: 44259

This answer is mostly to give you some insight into why your current approach does not work, and how you generally solve it.

The reason m doesn't help is that the other answer is wrong. This is not what m does. m simply makes the anchors match line beginnings and endings in addition to the string beginnings and endings. Some regex flavors have s for what you want to accomplish, but ECMAScript did not have it before ES2018. The simplest thing (and general solution) is to replace . (which matches everything except line breaks) with [\s\S] (which matches whitespace and non-whitespace, i.e. everything).
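A small sketch of both points, using made-up strings: m changes what the anchors match, while [\s\S] is what lets the match cross a line break.

```javascript
// Hypothetical two-line input.
const text = "<i>line one\nline two</i>";

// With m, ^ and $ also match at line boundaries, but . still stops at "\n".
const withM = /^<i>(.+)<\/i>$/m.test(text);

// [\s\S] matches any character, line breaks included, so no flag is needed.
const withAnyChar = /^<i>([\s\S]+)<\/i>$/.test(text);

// What m actually does: each line can satisfy ^...$ on its own.
const perLine = "a\nb".match(/^\w$/gm);
const wholeString = "a\nb".match(/^\w$/g);

console.log(withM);       // false
console.log(withAnyChar); // true
console.log(perLine);     // [ 'a', 'b' ]
console.log(wholeString); // null
```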

However, Casimir's approach is better in your case, as it avoids some other problems like greediness. Of course, as Casimir said, if there are tags in between the opening and closing <i> tags, then the approach will not work. In that case, something like <i>([\s\S]+?)</i> might be an option, but that's still not the full solution, in case you have nested i-tags or attributes in the opening tag, or capitalized I-tags and whatnot.
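The greediness problem can be seen with two <i> elements in one string. This is only an illustrative sketch with an invented input; the g flag is added here so that every element is replaced.

```javascript
// Hypothetical input with two separate <i> elements.
const twoTags = "<i>a</i> and\n<i>b</i>";

// Greedy: [\s\S]+ runs to the LAST </i>, swallowing the tags in between.
const greedy = twoTags.replace(/<i>([\s\S]+)<\/i>/g, "$1");

// Lazy: [\s\S]+? stops at the FIRST </i> after each <i>.
const lazy = twoTags.replace(/<i>([\s\S]+?)<\/i>/g, "$1");

console.log(JSON.stringify(greedy)); // "a</i> and\n<i>b"
console.log(JSON.stringify(lazy));   // "a and\nb"
```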

All in all, using regex to parse HTML is wrong! You should really use DOM manipulation. Especially, since you are using Javascript - THE language for DOM manipulation. What you should really do is traverse the DOM for all i tags in your demo element, and replace them with their inner HTML.
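A minimal sketch of that DOM approach might look like the following. The helper name unwrapItalics is hypothetical; it relies only on the standard querySelectorAll, childNodes and replaceWith DOM members.

```javascript
// Hypothetical helper: replaces every <i> inside root with its own children.
function unwrapItalics(root) {
  // Copy the matches into an array first, because replacing nodes
  // changes the tree while we iterate.
  const italics = Array.from(root.querySelectorAll("i"));
  for (const el of italics) {
    // Put the element's child nodes where the element was,
    // which drops the <i> tag but keeps its contents.
    el.replaceWith(...el.childNodes);
  }
  return italics.length; // number of <i> elements that were unwrapped
}
```

In the page from the question, it would be called as unwrapItalics(document.getElementById("demo")) from the button's click handler. Line breaks, nested tags and attributes need no special handling, since the browser has already parsed the markup.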

Upvotes: 1
