SuperSpy

Reputation: 1314

How to replace 'innerHTML' of tag that has specific class (the nth occurrence) (using regex)?

What I am trying to achieve

I am trying to replace the inner HTML of any HTML tag that has a specific class assigned to it, within a string obtained from file_get_contents(), without altering the rest of the content. Later I will write the result to a file with file_put_contents().

I am specifically trying to avoid DOMDocument, XPath, and simple_html_dom, because these alter the formatting of the document.

The class markers are just a way to mark the elements in the source, like lightbox does. Marking with a class seemed most elegant, but maybe marking elements in a different way makes the solution easier? I doubt it will make a difference though.


The code should also match when:

It is not necessary but it would be amazing if it even matches if:

What I have tried

(in reverse chronological order)

1 - Trying to work with the following function, which I found in other SO answers and on php.net:

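function preg_replace_nth($pattern, $replacement, $subject, $nth = 1) {
    // The limit argument stops the callback after the first $nth matches.
    return preg_replace_callback(
        $pattern,
        function ($found) use ($pattern, $replacement, &$nth) {
            $nth--;
            // Only the nth match is rewritten; earlier matches are returned unchanged.
            if ($nth == 0) return preg_replace($pattern, $replacement, reset($found));
            return reset($found);
        },
        $subject,
        $nth
    );
}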

I am not a regex expert, and in combination with the PHP functions it becomes very difficult for me, which is why I am asking for help. (I have been working on this for about eight hours.)

I tried feeding it the following regex pattern (with many small alterations):

'#(?<=class=\"classToMatch\".*?>).*?(?=</)#';

For the last 30 alterations it keeps returning:

Warning: preg_replace_callback(): Compilation failed: lookbehind assertion is not fixed length at offset xx

Things I realise that are perhaps problematic for regex:


2 - working with simple_html_dom and DOMDocument

At first I was delighted to see that it worked, but when I opened the source of the edited document I was horrified, because a lot of the formatting had been removed.

This was the working code; it should be fine for anyone working with HTML documents that contain little PHP and JavaScript.

$nth = 0;              // nth occurrence (starts with 0)
$replaceWith = '';     // replacement string

$dom = new DOMDocument();
@$dom->loadHTMLFile("source.php");

// find all elements with specific class
$finder = new DOMXPath($dom);
$nodes = $finder->query("//*[contains(concat(' ', normalize-space(@class), ' '), ' classname ')]");

if (!is_int($nodes->length) || $nodes->length < 1) die('No element found');

$nodeToChange = $nodes->item($nth);

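// remove all existing children, then append the replacement markup
while ($nodeToChange->firstChild) {
    $nodeToChange->removeChild($nodeToChange->firstChild);
}
$fragment = $dom->createDocumentFragment();
$fragment->appendXML($replaceWith);
$nodeToChange->appendChild($fragment);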

$dom->saveHTMLFile("test.php");

3 - Various approaches with strpos() etc. I am currently considering returning to these functions.

Upvotes: 0

Views: 747

Answers (1)

Michael Nero

Reputation: 1416

The following regex might be helpful to you:

<(?<tag>\w*)\sclass=\"lent-editable\">(?<text>.*)</\k<tag>>

The inner HTML you want to replace is captured in the named group "text".
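
As a rough illustration, a pattern of this shape could be combined with preg_replace_callback() to change only the nth occurrence. This is a minimal sketch, not a tested solution for arbitrary HTML: the function name replace_nth_inner_html and its parameters are hypothetical, the extra named groups "open" and "close" are additions to the pattern above, and it assumes the class attribute is the only attribute on the tag and that elements with the same tag name are not nested inside the match.

```php
<?php
// Hypothetical helper (not part of the original answer): replaces the inner
// HTML of the nth element (1-based) whose only attribute is class="$class".
function replace_nth_inner_html(string $html, string $class, string $replacement, int $nth = 1): string
{
    // Same idea as the pattern above, with these assumed changes:
    //  - the class name is injected through preg_quote()
    //  - the opening and closing tags are captured so they can be re-emitted verbatim
    //  - .*? is lazy and the "s" flag lets "." match newlines
    $pattern = '#(?<open><(?<tag>\w+)\sclass="' . preg_quote($class, '#') . '">)'
             . '(?<text>.*?)'
             . '(?<close></\k<tag>>)#s';

    $count = 0;
    $result = preg_replace_callback(
        $pattern,
        function (array $m) use (&$count, $nth, $replacement): string {
            $count++;
            if ($count !== $nth) {
                return $m[0]; // not the requested occurrence: leave it untouched
            }
            // Keep the original tags byte for byte; swap only the inner text.
            return $m['open'] . $replacement . $m['close'];
        },
        $html
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}

$html = "<p class=\"lent-editable\">one</p>\n"
      . "<div class=\"lent-editable\">two</div>\n"
      . "<p class=\"other\">three</p>";

echo replace_nth_inner_html($html, 'lent-editable', 'NEW', 2);
```

Because everything outside the matched element is passed through untouched, the formatting of the rest of the file is preserved. Nested elements with the same tag name, or tags that carry other attributes, would need a more elaborate pattern or a real parser.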

Upvotes: 1
