Reputation: 6552
As hard as I try, PREG and I don't get along, so I am hoping one of you PHP gurus can help out.
I have some HTML source code coming in to a PHP script, and I need specific items stripped out/removed from the source code.
First, if this comes in as part of HTML (could be multiple instances):
<SPAN class=placeholder title="" jQuery1262031390171="46">[[[SOMETEXT]]]</SPAN>
I want it converted into simply [[[SOMETEXT]]]
Note that the prefix will always be (I think):
<SPAN class=placeholder
.. and suffix will always be
</SPAN>
(yes, capital SPAN), but the title="" and jQuery###="#" pieces may be different. [[[SOMETEXT]]] could be anything. I essentially want the SPAN tag removed.
Next, if this comes as part of HTML (also could be multiple instances):
<span style="" class="placeholder" title="">[[[SOMETEXT]]]</span>
.. same thing - I just want the [[[SOMETEXT]]] part to remain. I think the <span style="" class="placeholder" piece will always be the prefix, and </span> (in this case, lowercase span tags) will be the suffix.
I understand this may take two PREG calls, but I would like to be able to pass the HTML text into a function and get a cleaned/stripped version back, something like this:
$dirty_text = $_POST['html_text'];
$clean_text = strip_placeholder_spans($dirty_text);
function strip_placeholder_spans( $in_text ) {
// all the preg magic happens here, and returns result
}
ADDED/UPDATED FOR CLARITY
OK, I'm getting some good feedback and getting close. However, to make it clearer, here is an example. I want to send this text into the function strip_placeholder_spans():
<blockquote>
<h2 align="center">Firefox: <span class="placeholder" title="">[[[ITEM1]]]</span></h2>
<h2 align="center">IE1:<SPAN class=placeholder title="" jQuery1262031390171="46">[[[ITEM2]]]</SPAN>
</h2>
<h2 align="center">IE2:<SPAN class=placeholder title="" jQuery1262031390412="52">[[[ITEM3]]]</SPAN>
</h2>
<h2 align="center"><br><font face="Arial, Helvetica, sans-serif">COMPLETE</font></h2>
<p align="center">Your Text Can Go Here</p>
<p align="center"><a href="javascript:self.close()">Close this Window</a></p>
<p align="center"><br></p>
<p align="center"><a href="javascript:self.close()"><br></a></p></blockquote>
<p align="center"></p>
and when it comes back, it should be this:
<blockquote>
<h2 align="center">Firefox: [[[ITEM1]]]</h2>
<h2 align="center">IE1:[[[ITEM2]]]</h2>
<h2 align="center">IE2:[[[ITEM3]]]</h2>
<h2 align="center"><br><font face="Arial, Helvetica, sans-serif">COMPLETE</font></h2>
<p align="center">Your Text Can Go Here</p>
<p align="center"><a href="javascript:self.close()">Close this Window</a></p>
<p align="center"><br></p>
<p align="center"><a href="javascript:self.close()"><br></a></p></blockquote>
<p align="center"></p>
Upvotes: 1
Views: 421
Reputation: 53871
Step one: Remove regular expressions from your toolbox when dealing with HTML. You need a parser.
Step two: Download simple_html_dom for PHP.
Step three: Parse
$html = str_get_html('<SPAN class=placeholder title="" jQuery1262031390171="46">[[[SOMETEXT]]]</SPAN>');
// find() indexes from 0, and the property name is lowercase "innertext"
$spanText = $html->find('span', 0)->innertext;
Step four: Profit!
Edit
$html->find('span.placeholder', 0)->innertext;
will return what you want. The selector span.placeholder matches only spans with class=placeholder.
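If installing simple_html_dom is not an option, the same parser-based idea can be sketched with PHP's bundled DOM extension instead (this swaps in DOMDocument and DOMXPath; it is not the library named above). The function name strip_placeholder_spans_dom and the wrapper div with id "wrap" are made up for illustration. One caveat: the DOM serializer may normalize the markup (quoting attributes, lowercasing tags, adjusting whitespace), so the result is not guaranteed to be byte-identical to the input outside the removed spans.

```php
<?php
// Hypothetical helper: unwraps every <span class="placeholder"> using ext-dom.
function strip_placeholder_spans_dom(string $html): string
{
    $doc = new DOMDocument();
    // Legacy markup triggers libxml warnings; collect them instead of printing.
    $prev = libxml_use_internal_errors(true);
    // The XML declaration hints UTF-8; the wrapper div lets us serialize only the fragment.
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="wrap">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($prev);

    $xpath = new DOMXPath($doc);
    // Match spans whose class list contains the word "placeholder".
    $query = '//span[contains(concat(" ", normalize-space(@class), " "), " placeholder ")]';
    foreach (iterator_to_array($xpath->query($query)) as $span) {
        // Move the span's children in front of it, then drop the empty span.
        while ($span->firstChild) {
            $span->parentNode->insertBefore($span->firstChild, $span);
        }
        $span->parentNode->removeChild($span);
    }

    // Serialize only what sits inside the wrapper div.
    $out = '';
    $wrap = $xpath->query('//div[@id="wrap"]')->item(0);
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Because the parser lowercases tag names, the uppercase SPAN that IE produces is handled by the same query as the lowercase one.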
Upvotes: 1
Reputation: 38318
Using an HTML parser is the most robust solution. That said, the following regular expression will work for the two code examples you posted:
$s= <<<STR
<span style="" class="placeholder" title="">[[[SOMETEXT]]</span>
Some Other text & <b>Html</b>
<SPAN class=placeholder title="" jQuery1262031390171="46">[[[SOMETEXT]]]</SPAN>
STR;
preg_match_all('/\<span[^>]+?class="*placeholder"*[^>]+?>([^<]+)?<\/span>/isU', $s, $m);
var_dump($m);
Regular expressions make for very narrowly focused code: this example only handles very specific, well-formed HTML. For instance, it won't parse <span class="placeholder">some text < more text</span>. If you have control over the source HTML, this may be good enough.
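To turn the matching idea above into the function the question asks for, here is a minimal sketch that uses preg_replace instead of preg_match_all, so the spans are removed in place rather than just collected. The pattern is my own variant, not the one from this answer. It assumes that placeholder spans are never nested, that "placeholder" is the only class on the tag, and that the span contents contain no inner </span>.

```php
<?php
// Sketch only: strips <span class=placeholder ...>...</span> wrappers, any letter case.
function strip_placeholder_spans(string $in_text): string
{
    // <span ... class=placeholder ...> with optional quotes around the class value;
    // (?![\w-]) rejects longer class names such as "placeholder-x".
    $pattern = '/<span\b[^>]*\bclass\s*=\s*["\']?placeholder(?![\w-])["\']?[^>]*>(.*?)<\/span>/is';

    // Replace each matching span with its captured contents.
    // preg_replace() returns null on a PCRE error; fall back to the input then.
    return preg_replace($pattern, '$1', $in_text) ?? $in_text;
}

$dirty_text = '<h2 align="center">IE1:<SPAN class=placeholder title="" jQuery1262031390171="46">[[[ITEM2]]]</SPAN></h2>';
echo strip_placeholder_spans($dirty_text), "\n";
```

Spans with any other class are left untouched. Note that this only removes the span tags; it does not collapse the line break that follows </SPAN> in the IE samples from the question.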
Upvotes: 1
Reputation: 3960
I think this should solve your problem:
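function strip_placeholder_spans( $in_text ) {
    // Returns the text between the first ">" and the next "</" (single span only).
    if (preg_match("/>(.*?)<\//", $in_text, $result)) { return $result[1]; }
    return $in_text; // no match: return the input unchanged
}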
Upvotes: 1