Reputation: 650
Given an HTML string like this one:
Lorem ipsum dolor sit amet, <a href="#">consectetuer adipiscing</a>
elit, <strong>tincidunt</strong> ut volutpat.
How do I surround all the words with <span> elements, so it becomes:
<span>Lorem</span> <span>ipsum</span> <span>dolor</span> <span>sit</span>
<span>amet,</span> <a href="#"><span>consectetuer</span> <span>adipiscing</span></a>
<span>elit,</span> <strong><span>tincidunt</span></strong> <span>ut</span>
<span>volutpat.</span>
Upvotes: 1
Views: 3181
Reputation: 435
A simpler approach
preg_replace('([a-zA-Z.,!?0-9]+(?![^<]*>))', '<span>$0</span>', '{{your data}}');
It surrounds with <span> all words from your vocabulary [a-zA-Z.,!?0-9]+ except words in <brackets>.
This makes it easy to change the vocabulary if needed, e.g. if you don't want standalone punctuation to be wrapped.
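As a rough illustration of changing the vocabulary, the call can be put in a small function. The name wrap_words_regex and the $vocabulary parameter are made up for this sketch, and it assumes the vocabulary contains no "/" because that is used as the pattern delimiter here.

```php
<?php
// Hypothetical helper: $vocabulary is the body of a character class.
function wrap_words_regex(string $html, string $vocabulary = 'a-zA-Z.,!?0-9'): string
{
    // (?![^<]*>) rejects a match when a ">" follows before any "<",
    // which is the case for text inside a tag such as <a href="#">.
    $pattern = '/[' . $vocabulary . ']+(?![^<]*>)/';
    return preg_replace($pattern, '<span>$0</span>', $html);
}

$input = 'sit amet, <a href="#">consectetuer</a>';

// Default vocabulary: the comma stays inside the span.
echo wrap_words_regex($input), "\n";
// <span>sit</span> <span>amet,</span> <a href="#"><span>consectetuer</span></a>

// Vocabulary without punctuation: the comma is left outside.
echo wrap_words_regex($input, 'a-zA-Z0-9'), "\n";
// <span>sit</span> <span>amet</span>, <a href="#"><span>consectetuer</span></a>
```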
Upvotes: 4
Reputation: 5700
If @daftcoder's solution works for you, that's great, but it fails if you have entities (&lt; etc.) in your code, because the name inside the entity is treated as a word. I couldn't find any other cases where it failed.
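A short check of the entity case, using the same pattern as the regex answer and a made-up input string:

```php
<?php
// Pattern copied from the regex answer; "(" and ")" act as delimiters.
$pattern = '([a-zA-Z.,!?0-9]+(?![^<]*>))';

// Hypothetical input containing an entity.
$result = preg_replace($pattern, '<span>$0</span>', '1 &lt; 2');

echo $result, "\n";
// <span>1</span> &<span>lt</span>; <span>2</span>
```

The "lt" inside the entity is wrapped as if it were a word, so the entity no longer renders as "<".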
If that matters, you can use DOM manipulation in PHP. I know this is way more complicated, but it should work in more cases than the simple regex.
The walk and doReplace functions are converted from JS to PHP from the answer to another question (Surrounding individual words inside HTML text with SPAN tags?).
<?php
echo wrap_words('span', 'Lorem ipsum dolor sit amet, <a href="#">consectetuer adipiscing</a> elit, <strong>tincidunt</strong> ut volutpat.');

function wrap_words($tag, $text) {
    $document = new DOMDocument();
    $fragment = $document->createDocumentFragment();
    // appendXML expects well-formed XML, so unclosed tags or
    // HTML-only entities such as &nbsp; will make it fail
    $fragment->appendXML($text);
    walk($tag, $fragment);
    $html = $document->saveHTML($fragment);
    // using saveHTML with a documentFragment can leave an invalid "<>"
    // at the beginning of the string - remove it
    return preg_replace('/^<>/', '', $html);
}

// recursively visit every node below $root and wrap the words of each text node
function walk($tag, $root)
{
    if ($root->nodeType == XML_TEXT_NODE)
    {
        doReplace($tag, $root);
        return;
    }
    $children = $root->childNodes;
    // iterate backwards because doReplace() changes the child list
    for ($i = $children->length - 1; $i >= 0; $i--)
    {
        walk($tag, $children->item($i));
    }
}

// replace one text node with its words wrapped in <$tag> elements
function doReplace($tag, $text)
{
    $fragment = $text->ownerDocument->createDocumentFragment();
    // nodeValue is entity-decoded, so re-escape &, < and > before
    // the string is parsed as XML again
    $escaped = htmlspecialchars($text->nodeValue, ENT_NOQUOTES | ENT_XML1);
    $fragment->appendXML(preg_replace('/\S+/', "<{$tag}>\$0</{$tag}>", $escaped));
    $parent = $text->parentNode;
    $children = $fragment->childNodes;
    for ($i = $children->length - 1; $i >= 0; $i--)
    {
        $parent->insertBefore($children->item($i), $text->nextSibling);
    }
    $parent->removeChild($text);
}
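A possible variation, sketched here rather than taken from the linked answer: instead of building a markup string and parsing it again with appendXML, each word can be turned into an element with createElement and createTextNode, so special characters in the text never need manual escaping. The name wrap_words_dom and the throwaway "root" element are assumptions made for this example, and the input still has to be well-formed XML.

```php
<?php
// Hypothetical helper: wraps each run of non-whitespace in every text node.
function wrap_words_dom(string $tag, string $html): string
{
    $document = new DOMDocument();
    // A throwaway root element gives XPath a tree to search.
    $root = $document->appendChild($document->createElement('root'));
    $fragment = $document->createDocumentFragment();
    $fragment->appendXML($html); // requires well-formed XML
    $root->appendChild($fragment);

    $xpath = new DOMXPath($document);
    // Copy the node list into an array before modifying the tree.
    foreach (iterator_to_array($xpath->query('//text()')) as $text) {
        // Split into words and whitespace runs, keeping the whitespace.
        $pieces = preg_split('/(\s+)/', $text->nodeValue, -1,
            PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
        if (!$pieces) {
            continue;
        }
        $replacement = $document->createDocumentFragment();
        foreach ($pieces as $piece) {
            if (preg_match('/^\s+$/', $piece)) {
                // Whitespace is copied through unchanged.
                $replacement->appendChild($document->createTextNode($piece));
            } else {
                // A word becomes <tag>word</tag>; DOM escapes it on output.
                $wrapper = $document->createElement($tag);
                $wrapper->appendChild($document->createTextNode($piece));
                $replacement->appendChild($wrapper);
            }
        }
        $text->parentNode->replaceChild($replacement, $text);
    }

    // Serialize the children of the throwaway root, not the root itself.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $document->saveHTML($child);
    }
    return $out;
}

echo wrap_words_dom('span', 'a &lt; <b>c d</b>'), "\n";
```

The XPath query replaces the recursive walk, and serializing child by child avoids the "<>" clean-up step.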
Upvotes: 1
Reputation: 724
I tried this; I think it is what you are looking for:
$result = preg_replace("/(<[^>]+>)?\\w*/us", "<span>$0</span>", $searchText);
This is the input:
Lorem ipsum dolor sit amet, <a href="#">consectetuer adipiscing</a>elit, <strong>tincidunt</strong> ut volutpat.
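And this is the output (note that punctuation stays outside the spans and the tags themselves get wrapped, unlike the output requested in the question):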
<span>Lorem</span> <span>ipsum</span> <span>dolor</span> <span>sit</span> <span>amet</span>, <span><a href="#">consectetuer</span> <span>adipiscing</span><span></a></span><span>elit</span>, <span><strong>tincidunt</span><span></strong></span> <span>ut</span> <span>volutpat</span>.
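In the output above the tags end up inside the spans. One way to keep them outside, sketched here under the assumption that no ">" appears inside an attribute value, is to split the string into tags and text first and wrap words only in the text parts. The function name wrap_words_split is made up for this example.

```php
<?php
// Hypothetical helper: split on tags, wrap words in the text between them.
function wrap_words_split(string $html, string $tag = 'span'): string
{
    // With one capture group and PREG_SPLIT_DELIM_CAPTURE the result
    // alternates: even indexes are text, odd indexes are the tags.
    $parts = preg_split('/(<[^>]+>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // \S+ also covers entities such as &lt; as a single word.
            $parts[$i] = preg_replace('/\S+/', "<{$tag}>\$0</{$tag}>", $part);
        }
    }
    return implode('', $parts);
}

echo wrap_words_split('sit amet, <a href="#">consectetuer adipiscing</a> elit'), "\n";
// <span>sit</span> <span>amet,</span> <a href="#"><span>consectetuer</span> <span>adipiscing</span></a> <span>elit</span>
```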
Upvotes: 1