Reputation: 71
I'm using a PHP function to split text into blocks of at most N chars. Once each block has been "treated" somehow, the blocks are concatenated back together. The problem is that the text can be HTML, and if the split occurs inside an element that is still open, the "treatment" gets spoiled. Can someone give a hint about breaking the text only between closed tags?
Requirements:
<body> tags
<HTML> tags
<head> tags
Adding a sample (max block length = 173):
<div class="myclass">
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer dapibus sagittis lacus quis cursus.
</div>
<div class="anotherclass">
Nulla ligula felis, adipiscing ac varius et, sollicitudin eu lorem. Sed laoreet porttitor est, sit amet vestibulum massa pretium et. In interdum auctor nulla, ac elementum ligula aliquam eget
</div>
In the text above, given 173 chars as the limit, the text would break at "adipiscing"; however, that would break the <div class="anotherclass">. In this case, the split should occur at the first closing tag, even though the resulting block is shorter than the max limit.
Upvotes: 1
Views: 2302
Reputation: 1017
Hmmm, I've used some code where I had to split copy entered through a WYSIWYG editor and wanted to retrieve the first paragraph from it. It's a little dodgy, but it did the trick for me. If you wanted to show only the first "n" characters, you could apply substr to the "intro" var. Hope this starts you off :-|
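function break_html_description_to_chunks($description = null)
{
    $description = (string) $description;
    // Find the end of the first paragraph; strpos() returns false when there is no </p>.
    $firstParaEnd = strpos($description, "</p>");
    if ($firstParaEnd === false) {
        // No closing paragraph tag: treat the whole text as the intro.
        return array("intro" => $description, "body" => "");
    }
    $firstParaEnd += 4; // include the 4 characters of "</p>" itself
    $intro = substr($description, 0, $firstParaEnd);
    $body = (string) substr($description, $firstParaEnd); // everything after the first paragraph
    return array("intro" => $intro, "body" => $body);
}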
Upvotes: 0
Reputation: 23774
The "correct" way would be to parse the HTML and perform the shortening operations on its text nodes. In PHP5 you could use the DOM extension, and specifically DOMDocument::loadHTML().
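To make the DOM idea a bit more concrete, here is a minimal sketch of one way it might look for the question's case. It does not shorten text nodes. Instead it cuts only between top-level nodes, so every block keeps balanced tags. The function name split_html_between_tags and the wrapper id chunk-root are made up for illustration, and the sketch assumes a reasonably well-formed UTF-8 fragment, a newer PHP than the PHP5 mentioned above (7 or later, because of the type declarations), and lengths counted in bytes via strlen().

```php
<?php
// Hypothetical helper (not from the original answers): split an HTML fragment
// into blocks of at most $maxLen bytes, cutting only between top-level nodes.
function split_html_between_tags(string $html, int $maxLen): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    // Wrap the fragment in one known <div> so its top-level nodes share a parent.
    // The leading pseudo-declaration is a common trick to make libxml read UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="chunk-root">' . $html . '</div>');
    libxml_clear_errors();

    // The wrapper is the first <div> in document order.
    $root = $doc->getElementsByTagName('div')->item(0);

    $chunks = [];
    $current = '';
    foreach ($root->childNodes as $node) {
        // Serializing a whole node always yields balanced tags.
        $piece = $doc->saveHTML($node);
        // Start a new block when adding this piece would exceed the limit.
        if ($current !== '' && strlen($current) + strlen($piece) > $maxLen) {
            $chunks[] = $current;
            $current = '';
        }
        $current .= $piece;
    }
    if ($current !== '') {
        $chunks[] = $current;
    }
    return $chunks;
}

// Example call with two small top-level elements and a limit of 30 bytes.
$blocks = split_html_between_tags('<div class="a">aaaa</div><div class="b">bbbb</div>', 30);
```

Note that a single top-level element longer than the limit still ends up as one oversized block; handling that would mean recursing into its children. The parser may also normalize sloppy markup, so the re-joined blocks are not guaranteed to be byte-identical to the input.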
Upvotes: 1