eli.rodriguez

Reputation: 490

Regular expression to capture a tag

I have the following HTML text, and in JavaScript I need to capture all the "p" tags that have the class "page-break" and then replace them with some other text.

I need to use a regular expression because this HTML is going to be processed as text, not as a DOM structure.

<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Praesent pellentesque tincidunt adipiscing</p>

<p class="page-break">break</p>

<p>Suspendisse a velit at diam facilisis
egestas sit amet a lectus.</p>

<p class="page-break">other</p>

<p>Donec tristique placerat massa vitae hendrerit. Maecenas nec
massa adipiscing sem venenatis vehicula. Suspendisse mattis pretium
libero quis dignissim. Nulla volutpat imperdiet vehicula. Donec ut
tristique neque.</p>

What prevents me from using a DOM parser is that I plan to insert an invalid HTML element. I plan to transform the previous HTML into the following; I need to parse it first as text to skip HTML validation and then paste it like this:

 <div class="pag visible">
 <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.
    Praesent pellentesque tincidunt adipiscing</p>
 </div>
 <div class="pag">   
    <p>Suspendisse a velit at diam facilisis
    egestas sit amet a lectus.</p>
 </div>
 <div class="pag">   
    <p>Donec tristique placerat massa vitae hendrerit. Maecenas nec
    massa adipiscing sem venenatis vehicula. Suspendisse mattis pretium
    libero quis dignissim. Nulla volutpat imperdiet vehicula. Donec ut
    tristique neque.</p>
 </div>

As you can see, every ".page-break" paragraph is replaced.

Upvotes: 1

Views: 189

Answers (4)

ted

Reputation: 5329

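// your content
var content = '<p>Lorem ips...';
// matches every <p> whose class attribute is exactly "page-break";
// [\s\S]*? is lazy, so each match stops at the first closing </p>
var regex = /<p\s+class\s*=\s*"page-break"\s*>[\s\S]*?<\/p>/g;
// replace() returns a new string, so assign the result back
content = content.replace(regex, "<p>MY TEXT HERE</p>");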
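
For the transformation described in the question (one "pag" div per page, with the first one also marked "visible"), a similar pattern could be used with split() instead of replace(). This is only a rough sketch: the paginate name is hypothetical, and it assumes the class attribute is written exactly as class="page-break" with double quotes and no other classes or attributes.

```javascript
// Hypothetical helper, not from the question: splits the HTML text at every
// <p class="page-break"> paragraph and wraps each remaining chunk in a div.
function paginate(html) {
    // Assumption: the attribute is exactly class="page-break" (double quotes).
    // [\s\S]*? is lazy, so each match ends at the first closing </p>.
    var breakPattern = /<p\s+class\s*=\s*"page-break"\s*>[\s\S]*?<\/p>/g;
    var chunks = html.split(breakPattern);
    var pages = [];
    for (var i = 0; i < chunks.length; i++) {
        var chunk = chunks[i].trim();
        if (chunk === '') {
            continue; // skip empty pages, e.g. two breaks in a row
        }
        // Only the first page gets the extra "visible" class
        var cls = pages.length === 0 ? 'pag visible' : 'pag';
        pages.push('<div class="' + cls + '">\n' + chunk + '\n</div>');
    }
    return pages.join('\n');
}
```

Calling paginate(content) on the sample text should give three divs, since the two page-break paragraphs act as separators and are dropped from the output.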


Upvotes: 1

Vaman Kulkarni

Reputation: 3451

It is not advisable to parse HTML with regex. You can use XPath to fetch all the <p> elements that meet the criteria, then iterate over the returned list and update the textContent of each <p>, as shown in the snippet below.

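var pList = document.evaluate("//p[@class='page-break']", document, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
// a snapshot stays valid while the document is being modified
for (var i = 0; i < pList.snapshotLength; i++) {
    pList.snapshotItem(i).textContent = "New Text";
}

Explanation

//p[@class='page-break'] fetches all the <p> elements whose class attribute is exactly 'page-break'. The document.evaluate function returns an object of type XPathResult. A snapshot result type is used here because an iterator result (XPathResult.ANY_TYPE read with iterateNext()) becomes invalid as soon as the document is modified, which setting textContent does. snapshotItem(i) returns each matched element, and you can set the new text through its textContent property.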

Upvotes: 0

Oleg V. Volkov

Reputation: 22421

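Don't use regexp to parse HTML. Use the DOM instead. If you have a plain string, create a detached container element (a DocumentFragment itself has no .innerHTML, but a div or a template element does) and assign the string to its .innerHTML to get a DOM.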

Find your p tags with getElementsByTagName, check their .className and act accordingly.
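
A minimal sketch of that approach, assuming a browser environment where document is available. The function name replacePageBreakText is made up for illustration, and using a detached div as the container is an assumption of this sketch:

```javascript
// Sketch only: needs a browser-like environment that provides `document`.
function replacePageBreakText(html, replacement) {
    // Detached container: it is never attached to the page
    var container = document.createElement('div');
    container.innerHTML = html;
    var paragraphs = container.getElementsByTagName('p');
    for (var i = 0; i < paragraphs.length; i++) {
        // className holds the whole attribute value, so split it
        // to check for one class among possibly several
        var classes = paragraphs[i].className.split(/\s+/);
        if (classes.indexOf('page-break') !== -1) {
            paragraphs[i].textContent = replacement;
        }
    }
    // Serialize the modified tree back to a string
    return container.innerHTML;
}
```

Note that the browser normalizes the markup when it parses it, so the returned string may differ from the input in whitespace or in how invalid markup was repaired.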

Upvotes: 4

Rich Andrews

Reputation: 4188

Have you thought of using jQuery?

$('p.page-break').html('replacement value goes here');

This will replace the contents of each matching <p> with "replacement value goes here". (Note that .hasClass() only returns true or false, so it cannot be chained; the class has to go in the selector.)

Or $('p.page-break').remove(); will remove the <p> elements entirely.

Upvotes: 0
