Reputation: 490

Regular expression to capture a tag

I have the following HTML text, and in JavaScript I need to capture all the "p" tags that have the class "page-break" and then replace them with some other text.

I need to use a regular expression because this HTML is going to be processed as text, not as a DOM structure.

<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Praesent pellentesque tincidunt adipiscing</p>

<p class="page-break">break</p>

<p>Suspendisse a velit at diam facilisis
egestas sit amet a lectus.</p>

<p class="page-break">other</p>

<p>Donec tristique placerat massa vitae hendrerit. Maecenas nec
massa adipiscing sem venenatis vehicula. Suspendisse mattis pretium
libero quis dignissim. Nulla volutpat imperdiet vehicula. Donec ut
tristique neque.</p>

What prevents me from using a DOM parser is that I plan to insert an invalid HTML element. I plan to transform the previous HTML into the following, so I need to process it first as text to skip HTML validation and then paste it like this:

 <div class="pag visible">
 <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.
    Praesent pellentesque tincidunt adipiscing</p>
 </div>
 <div class="pag">   
    <p>Suspendisse a velit at diam facilisis
    egestas sit amet a lectus.</p>
 </div>
 <div class="pag">   
    <p>Donec tristique placerat massa vitae hendrerit. Maecenas nec
    massa adipiscing sem venenatis vehicula. Suspendisse mattis pretium
    libero quis dignissim. Nulla volutpat imperdiet vehicula. Donec ut
    tristique neque.</p>
 </div>

As you can see, every ".page-break" paragraph is replaced.

Upvotes: 1

Answers (4)

ted

Reputation: 5329

// your content
var content = '<p>Lorem ips...';
// match every <p class="page-break">...</p>; the lazy [\s\S]*? stops at the first </p> and also spans newlines
var regex = /<p\s+class\s*=\s*"page-break"\s*>[\s\S]*?<\/p>/g;
// replace() returns a new string, so keep the result
content = content.replace(regex, "<p>MY TEXT HERE</p>");
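
The same kind of pattern can also drive the transformation shown in the question, by splitting the text at every page break instead of replacing it. The following is only a minimal sketch: the function name `paginate` is made up for illustration, and it assumes the marker is always written exactly as `<p class="page-break">` (double quotes, no other classes or attributes).

```javascript
// Hypothetical helper: wraps the text between page-break markers in page <div>s.
function paginate(html) {
  // Matches <p class="page-break">...</p>; [\s\S]*? is lazy and spans newlines.
  var pageBreak = /<p\s+class\s*=\s*"page-break"\s*>[\s\S]*?<\/p>/g;

  return html
    .split(pageBreak)                                       // text between markers
    .map(function (chunk) { return chunk.trim(); })
    .filter(function (chunk) { return chunk.length > 0; })  // skip empty pages
    .map(function (chunk, index) {
      // Only the first page is visible, as in the question's target markup.
      var cls = index === 0 ? 'pag visible' : 'pag';
      return '<div class="' + cls + '">\n' + chunk + '\n</div>';
    })
    .join('\n');
}
```

Calling `paginate(content)` on the question's sample text should give three `div.pag` blocks, with the "break" and "other" paragraphs dropped. Markup that deviates from the assumed form (single quotes, extra classes) would not be matched by this pattern.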


Upvotes: 1

Vaman Kulkarni

Reputation: 3451

It is not advisable to parse HTML with regex. You can use XPath to fetch all the <p> elements that match the specified criteria, iterate over the returned list, and update the textContent of each <p>, as shown in the snippet below.

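// A snapshot result stays valid even when the matched nodes are modified.
var pList = document.evaluate("//p[@class='page-break']", document, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
for (var i = 0; i < pList.snapshotLength; i++) {
    pList.snapshotItem(i).textContent = "New Text";
}

Explanation

//p[@class='page-break'] will fetch all the <p> elements whose class attribute is exactly 'page-break'. The document.evaluate function returns an object of type XPathResult. With a snapshot result, snapshotLength gives the number of matches and snapshotItem(i) returns each element. An iterator result, read with iterateNext(), becomes invalid as soon as the document is modified, so it is not a good fit for updating nodes inside the loop. You can set the new text using the textContent property.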

Upvotes: 0

Oleg V. Volkov

Reputation: 22421

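Don't use regexp to parse HTML. Use the DOM instead. If you have a plain string, create a detached element (for example with document.createElement('div'); a DocumentFragment itself has no .innerHTML) and assign the string to its .innerHTML to get a DOM.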

Find your p tags with getElementsByTagName, check their .className and act accordingly.
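
A rough sketch of how this could produce the paged markup from the question, assuming a browser environment where `document` is available. The function name `paginateWithDom` is hypothetical, and a detached `div` is used as the parsing container.

```javascript
// Sketch: split an HTML string into "pages" at every <p class="page-break">.
// Needs a browser-like environment that provides `document`.
function paginateWithDom(html) {
  // Detached container: the markup is parsed but never rendered.
  var container = document.createElement('div');
  container.innerHTML = html;

  var output = document.createElement('div');
  var page = null;

  function startPage() {
    page = document.createElement('div');
    // Only the first page gets the "visible" class, as in the question.
    page.className = output.children.length === 0 ? 'pag visible' : 'pag';
    output.appendChild(page);
  }

  startPage();
  // Copy the child list first, because appendChild moves nodes out of container.
  var nodes = Array.prototype.slice.call(container.childNodes);
  nodes.forEach(function (node) {
    var isBreak = node.nodeType === 1 &&
      node.tagName === 'P' &&
      node.classList.contains('page-break');
    if (isBreak) {
      startPage(); // drop the marker and open the next page
    } else {
      page.appendChild(node);
    }
  });
  return output.innerHTML;
}
```

Because the pages are built as real elements, the result is always well-formed, which avoids having to paste unbalanced `</div><div>` text. Two adjacent page breaks would produce an empty page in this sketch.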

Upvotes: 4

Rich Andrews

Reputation: 4188

Have you thought of using jQuery?

$('p.page-break').html('replacement value goes here');

This will replace the contents of each <p class="page-break"> with "replacement value goes here",

or $('p.page-break').remove(); will remove the <p> element entirely.

Upvotes: 0
