Samin

Reputation: 650

Surround all words with spans in PHP

Given an HTML string like this one:

Lorem ipsum dolor sit amet, <a href="#">consectetuer adipiscing</a>
elit, <strong>tincidunt</strong> ut volutpat.

How do I surround all the words with <span> elements, so it becomes:

<span>Lorem</span> <span>ipsum</span> <span>dolor</span> <span>sit</span>
<span>amet,</span> <a href="#"><span>consectetuer</span> <span>adipiscing</span></a>
<span>elit,</span> <strong><span>tincidunt</span></strong> <span>ut</span>
<span>volutpat.</span>

Upvotes: 1

Views: 3181

Answers (3)

daftcoder

Reputation: 435

A simpler approach

preg_replace('([a-zA-Z.,!?0-9]+(?![^<]*>))', '<span>$0</span>', '{{your data}}');

It wraps in <span> every run of characters from your vocabulary, [a-zA-Z.,!?0-9]+, except those inside <brackets>, i.e. inside a tag.
The vocabulary is easy to change if needed, e.g. if you don't want a standalone punctuation mark to be wrapped.
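
To see how the lookahead behaves, here is a minimal sketch that runs the same character class and lookahead on the markup from the question. It is written with "/" delimiters instead of parentheses, and the variable names ($html, $pattern, $result) are only for illustration.

```php
<?php
// Sample markup from the question.
$html = 'Lorem ipsum dolor sit amet, <a href="#">consectetuer adipiscing</a> '
      . 'elit, <strong>tincidunt</strong> ut volutpat.';

// (?![^<]*>) rejects a candidate word when a ">" can be reached from the
// end of the word without passing a "<", i.e. when the word sits inside a
// tag (a, href, strong, ...).
$pattern = '/[a-zA-Z.,!?0-9]+(?![^<]*>)/';

$result = preg_replace($pattern, '<span>$0</span>', $html);
echo $result, "\n";
// <span>Lorem</span> <span>ipsum</span> <span>dolor</span> <span>sit</span>
// <span>amet,</span> <a href="#"><span>consectetuer</span> <span>adipiscing</span></a>
// <span>elit,</span> <strong><span>tincidunt</span></strong> <span>ut</span>
// <span>volutpat.</span>   (printed on one line)
```

Tag names and attributes are left alone because each of them is followed by a ">" before any "<" appears.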

Upvotes: 4

mcrumley

Reputation: 5700

If @daftcoder's solution works for you, that's great, but it fails if your markup contains entities (&lt; etc.): the entity name is matched as a word and wrapped, which breaks the entity. I couldn't find any other cases where it failed.
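
A small sketch of that failure case, assuming the same pattern written with "/" delimiters (the input string and variable names are made up for the demonstration):

```php
<?php
$pattern = '/[a-zA-Z.,!?0-9]+(?![^<]*>)/';

// Text containing an escaped "<".
$input = 'a &lt; b';

$result = preg_replace($pattern, '<span>$0</span>', $input);
echo $result, "\n";
// <span>a</span> &<span>lt</span>; <span>b</span>
// The entity name "lt" was treated as a word, so "&lt;" is split apart
// and no longer decodes to "<".
```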

If that matters, you can use DOM manipulation in PHP. I know this is way more complicated, but it should work in more cases than the simple regex.

The walk and doReplace functions are converted from JS to PHP from the answer to another question (Surrounding individual words inside HTML text with SPAN tags?).

<?php

echo wrap_words('span', 'Lorem ipsum dolor sit amet, <a href="#">consectetuer adipiscing</a> elit, <strong>tincidunt</strong> ut volutpat.');

function wrap_words($tag, $text) {
    $document = new DOMDocument();
    $fragment = $document->createDocumentFragment();
    $fragment->appendXml($text);
    walk($tag, $fragment);
    $html = $document->saveHtml($fragment);
    // using saveHTML with a documentFragment can leave an invalid "<>"
    // at the beginning of the string - remove it
    return preg_replace('/^<>/', '', $html);
}

function walk($tag, $root)
{
    if ($root->nodeType == XML_TEXT_NODE)
    {
        doReplace($tag, $root);
        return;
    }
    $children = $root->childNodes;
    for ($i = $children->length - 1; $i >= 0; $i--)
    {
        walk($tag, $children->item($i));
    }
}

function doReplace($tag, $text)
{
    $fragment = $text->ownerDocument->createDocumentFragment();
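    // nodeValue is entity-decoded, so re-escape &, < and > before the
    // string is parsed as XML again
    $escaped = htmlspecialchars($text->nodeValue, ENT_NOQUOTES);
    $fragment->appendXML(preg_replace('/\S+/', "<{$tag}>\$0</{$tag}>", $escaped));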
    $parent = $text->parentNode;
    $children = $fragment->childNodes;
    for ($i = $children->length - 1; $i >= 0; $i--)
    {
        $parent->insertBefore($children->item($i), $text->nextSibling);
    }
    $parent->removeChild($text);
}
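
A possible variation on doReplace, sketched under assumptions: instead of building a markup string and re-parsing it with appendXML, the wrapper elements can be created through the DOM API, so decoded characters such as < or & stay plain text and are escaped again on output. wrap_text_node is a hypothetical name, not part of the code above or of the answer it was converted from.

```php
<?php
// Hypothetical helper: wrap each word of one text node in <$tag> elements.
function wrap_text_node(DOMText $text, string $tag): void
{
    $doc = $text->ownerDocument;
    $parent = $text->parentNode;

    // Split into alternating runs of whitespace and words, keeping both.
    $parts = preg_split('/(\s+)/', $text->nodeValue, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    foreach ($parts as $part) {
        if (preg_match('/\S/', $part)) {
            // A word: put it in its own element as a text child.
            $node = $doc->createElement($tag);
            $node->appendChild($doc->createTextNode($part));
        } else {
            // Whitespace: keep it as a plain text node.
            $node = $doc->createTextNode($part);
        }
        // Inserting before the original node preserves the order.
        $parent->insertBefore($node, $text);
    }
    $parent->removeChild($text);
}

$doc = new DOMDocument();
$doc->loadXML('<p>a &lt; b</p>');
wrap_text_node($doc->documentElement->firstChild, 'span');

$out = $doc->saveXML($doc->documentElement);
echo $out, "\n";
// <p><span>a</span> <span>&lt;</span> <span>b</span></p>
```

The helper could be called from walk in place of doReplace; the trade-off is a few more DOM calls in exchange for never having to escape text by hand.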

Upvotes: 1

hugohabel

Reputation: 724

I tried this; I think it is what you are looking for:

$result = preg_replace("/(<[^>]+>)?\\w*/us", "<span>$0</span>", $searchText);

This is the input

Lorem ipsum dolor sit amet, <a href="#">consectetuer adipiscing</a>elit, <strong>tincidunt</strong> ut volutpat.

And this is the output

<span>Lorem</span> <span>ipsum</span> <span>dolor</span> <span>sit</span> <span>amet</span>, <span><a href="#">consectetuer</span> <span>adipiscing</span><span></a></span><span>elit</span>, <span><strong>tincidunt</span><span></strong></span> <span>ut</span> <span>volutpat</span>.

Upvotes: 1
