Reputation: 23510
I have an HTML string and I'd like to apply some formatting to its plain-text content. That is, I'd like to extract everything that is text and not part of a tag. I had planned to use a DOMDocument, but I don't know in advance which tags I'm looking for, nor their IDs.
For example, I might have this string:
<p><i>some tex<span class="aclass">t</span> in the document.</i>Whoooa <img src="anImage.png" /></p>
I'd like to format the string "some text in the document.Whoooa " and then reinject the formatted text into the page with the original tags intact. For example, I'd like to put a space after the period and delete the trailing space.
How would I do that?
Upvotes: 3
Views: 983
Reputation: 197659
I have started to create a class called TextRange that gives an easy interface to text nodes as a single string representation of a certain part of a DOMDocument.
You need to find out where the string needs to be changed, and the TextRange class can then split nodes if necessary. I have put a lengthy explanation of it in the following two questions:
The first one also contains a fairly raw TextRangeTrimmer class, which can remove whitespace at the beginning and end of such a TextRange.
As you only modify text node values, the original tags are always preserved. You might need to clean up unused (empty) tags later yourself depending on your use.
It works based on DOMDocument and accepts a parent DOMElement (the range will be all text node children), an XPath query result (DOMNodeList), or just an array of text node elements.
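The TextRange code itself is not shown in this thread, so the following does not reproduce it. It is only a minimal sketch of the underlying idea (edit text node values, leave the tags alone) using plain DOMDocument and DOMXPath. The function name formatTextNodes and the wrapper id gr-wrapper are hypothetical names made up for this illustration, and the sketch assumes ASCII input and handles only the two rules from the question.

```php
<?php
// Hypothetical helper (not part of TextRange or any library): applies the
// two formatting rules from the question to the text nodes only.
function formatTextNodes(string $html): string
{
    $doc = new DOMDocument();
    // Silence libxml warnings about fragment markup, then restore the setting.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a known container so it can be found again later.
    $doc->loadHTML('<div id="gr-wrapper">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $wrapper = $xpath->query('//div[@id="gr-wrapper"]')->item(0);

    // All text nodes below the wrapper, in document order, whatever their tags.
    $textNodes = iterator_to_array($xpath->query('.//text()', $wrapper));

    $prevEndsWithDot = false;
    $lastNode = null;
    foreach ($textNodes as $node) {
        $text = $node->data;
        if ($text === '') {
            continue;
        }
        // Rule 1a: a period followed by a non-space inside this node.
        $text = preg_replace('/\.([^ ])/', '. $1', $text);
        // Rule 1b: the period ended the previous text node.
        if ($prevEndsWithDot && $text[0] !== ' ') {
            $text = ' ' . $text;
        }
        $prevEndsWithDot = substr($text, -1) === '.';
        $node->data = $text;
        $lastNode = $node;
    }
    // Rule 2: drop trailing spaces from the last non-empty text node.
    if ($lastNode !== null) {
        $lastNode->data = rtrim($lastNode->data, ' ');
    }

    // Serialize only the original fragment, not the wrapper or html/body.
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$html = '<p><i>some tex<span class="aclass">t</span> in the document.</i>Whoooa <img src="anImage.png" /></p>';
echo formatTextNodes($html), PHP_EOL;
```

Because the period and "Whoooa" live in different text nodes, the sketch carries a flag from one node to the next instead of running the regex once over the whole string. Note that saveHTML re-serializes the markup, so details such as the self-closing slash on the img tag may not come back byte for byte.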
Upvotes: 1
Reputation: 59699
Use strip_tags!
$str = '<p><i>some tex<span class="aclass">t</span> in the document.</i>Whoooa <img src="anImage.png" /></p>';
var_dump( strip_tags( $str));
This will output:
string(33) "some text in the document.Whoooa "
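Then, for the rest of your question (note that strip_tags discards the tags, so this formats the plain text only):
// Work on the stripped text, not on the original HTML
$text = strip_tags( $str);
// Put a space after the point
$text = preg_replace( '/\.([^ ])/', '. $1', $text);
// and delete the trailing space.
$text = rtrim( $text, ' ');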
Upvotes: 0
Reputation: 917
If at all possible, doing it client-side is easier with jQuery, which is made specifically for easy DOM manipulation. Otherwise, you're generally going to need preg_match and/or an XML parser. There are a few DOM parsers for PHP; DOMDocument, which the question already mentions, is part of PHP's bundled DOM extension.
Upvotes: 2