Reputation: 14412
I'm trying to use a regex to wrap the first word of a page's content in a span. The content contains HTML, so I need to make sure an actual word gets chosen rather than part of a tag. The content is different on every page.
My current script is:
preg_match('/(<(.*?)>)*/i',$page_content,$matches);
$stripped = substr($page_content,strlen($matches[0]));
preg_match('/\b[a-z]* \b/i',$stripped,$strippedmatch);
echo substr($page_content, 0, strlen($matches[0])).'<span class="h1">'.$strippedmatch[0].'</span>'.substr($stripped, strlen($strippedmatch[0]));
However, if $page_content is
<p><span class="title">This is </span> my title!</p>
then my regex thinks the first word is "span" and adds the tags around that.
Is there any way to fix this, or a better way to do it?
Upvotes: 0
Views: 2276
Reputation: 383736
You shouldn't be using regex for this, but if you insist, you can try something like this:
<?php
$texts = array(
'<p><span class="title">This is </span> my title!</p>',
'<1> <2> <3> blah blah <4> <5> blah',
'garbage <1> <2> real stuff begins <3> <4>',
);
foreach ($texts as $text) {
print preg_replace('/(>\s*)(\w+)/', '\1{{\2}}', $text, 1)."\n";
}
?>
This prints:
<p><span class="title">{{This}} is </span> my title!</p>
<1> <2> <3> {{blah}} blah <4> <5> blah
garbage <1> <2> {{real}} stuff begins <3> <4>
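For the non-regex route recommended at the top of this answer, here is a minimal sketch using PHP's DOM extension. The function name wrapFirstWordDom is made up for illustration; the sketch assumes the dom extension is installed and that a "word" is a run of ASCII word characters. It walks the text nodes only, so tag names and attributes are never candidates.

```php
<?php
// Hypothetical helper, not from the answer: wraps the first word found in
// any text node, so tag names and attributes can never be matched.
function wrapFirstWordDom($html)
{
    $previous = libxml_use_internal_errors(true); // silence warnings on odd markup
    $doc = new DOMDocument();
    // A wrapper <div> gives the fragment a single known root to read back from.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $root = $doc->getElementsByTagName('div')->item(0);
    $xpath = new DOMXPath($doc);

    foreach ($xpath->query('.//text()', $root) as $node) {
        // Assumption: a "word" is a run of ASCII \w characters.
        if (!preg_match('/^(\W*)(\w+)(.*)$/s', $node->nodeValue, $m)) {
            continue; // whitespace- or punctuation-only text node
        }
        $span = $doc->createElement('span');
        $span->setAttribute('class', 'h1');
        $span->appendChild($doc->createTextNode($m[2]));

        $parent = $node->parentNode;
        if ($m[1] !== '') {
            // Keep any leading whitespace or punctuation in front of the span.
            $parent->insertBefore($doc->createTextNode($m[1]), $node);
        }
        $parent->insertBefore($span, $node);
        $node->data = $m[3]; // the rest of the original text
        break; // only the first word is wrapped
    }

    // Serialize the wrapper's children, leaving out the wrapper itself.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo wrapFirstWordDom('<p><span class="title">This is </span> my title!</p>'), "\n";
```

Unlike the regex, this also handles input that starts with plain text, at the cost of re-serializing the markup, so the output may be normalized slightly compared with the input.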
Upvotes: 0
Reputation: 33749
This seems to work...
(?<=\>)\b\w*\b|^\w*\b
If you want to allow leading whitespace too (remember to trim the resulting match):
(?<=>)\s*\b\w*\b|^\s*\w*\b
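As a rough illustration of how the first pattern could be plugged into the asker's PHP, here is a small sketch. The sample strings and the use of preg_replace with a limit of 1 are my own assumptions, not part of the answer.

```php
<?php
// First pattern from this answer, used verbatim inside PHP delimiters.
$pattern = '/(?<=\>)\b\w*\b|^\w*\b/';

$samples = array(
    '<p><span class="title">This is </span> my title!</p>',
    'Plain text with no tags',
);

foreach ($samples as $html) {
    // $0 is the whole match; the limit of 1 stops after the first word.
    echo preg_replace($pattern, '<span class="h1">$0</span>', $html, 1), "\n";
}
// Prints:
// <p><span class="title"><span class="h1">This</span> is </span> my title!</p>
// <span class="h1">Plain</span> text with no tags
```

Note that the first pattern only matches a word that directly follows a `>`, so with `<p> Hello <b>world</b>` it would skip "Hello" and wrap "world"; the whitespace-tolerant variant covers that case.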
Upvotes: 1
Reputation: 5523
If I understand you correctly, you want a tag around the first word that is not part of a tag. You could do that with this regex:
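// The (?:...) group keeps the whole run of leading tags in \1; a repeated capturing group would keep only the last tag.
$code = preg_replace('/^((?:<.+?>\s*)+?)(\w+)/i', '\1<span class="h1">\2</span>', $code);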
It steps over the leading tags one by one until it reaches text outside a tag, then wraps the first word of that text.
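The same tag-skipping idea can be sketched as a reusable function. The name wrap_first_word is hypothetical, and this variant makes two assumptions of its own: tags contain no literal `>` inside attribute values, and content with no leading tag at all should still be handled.

```php
<?php
// Hypothetical helper name; a sketch of the tag-skipping idea, not the
// answer's exact pattern.
function wrap_first_word($html)
{
    // Group 1: optional whitespace plus every leading tag, captured as one run.
    // Group 2: the first word that follows.
    return preg_replace(
        '/^(\s*(?:<[^>]*>\s*)*)(\w+)/',
        '$1<span class="h1">$2</span>',
        $html,
        1
    );
}

echo wrap_first_word('<p><span class="title">This is </span> my title!</p>'), "\n";
// <p><span class="title"><span class="h1">This</span> is </span> my title!</p>
echo wrap_first_word('No tags here'), "\n";
// <span class="h1">No</span> tags here
```

If the first visible character is punctuation rather than a word character, the pattern does not match and the input comes back unchanged.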
Upvotes: 0