Reputation: 5362
How can I search for words between HTML tags? Say I have the following strings:
<span style="font-weight: bold;">
<font size="4">Bearings<br /><br /></font>
</span>
<span style="font-weight: bold;">
<font size="4">
Scale Drawing & Error in Measurement<br /><br />
</font>
</span>
<p align="left" class="MsoNormal" style="text-align: left;">
<b/>
<span lang="EN-GB">
<font size="4" class="Apple-style-span">
Solving Equations inc. Quadratic Formula
</font>
</span>
</b>
</p>
How can I search for the titles: Bearings, Scale Drawing & Error in Measurement, and Solving Equations inc. Quadratic Formula? The number and kind of HTML tags before and after each title are dynamic: they could be anything, and there could be any number of them. The titles themselves are dynamic too. I don't know what they are, which is why I'm searching for them. What I do know is that each title is at the start of the string, so I could search for a double quote and a right angle bracket ">, then the wildcard *, then an angle bracket and forward slash </
"> * </
Note that I know nothing about regex; I'm only saying that a search like this could work, since the VERY FIRST occurrence of </ implies the title is right before it.
Upvotes: 0
Views: 257
Reputation: 397
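Are you trying to do it at runtime? You could use JavaScript and the DOM innerHTML property. You say the HTML is dynamic and could vary, but if the titles are always inside span tags, something like this might work for you.
<script type="text/javascript">
// getElementsByTagName (capital N) returns a collection, so read each element in turn.
var spans = document.getElementsByTagName("span");
for (var i = 0; i < spans.length; i++) {
  // innerHTML still includes nested tags such as <font>.
  document.write(spans[i].innerHTML);
}
</script>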
See more here: http://www.w3schools.com/htmldom/dom_methods.asp
Upvotes: 1
Reputation: 19466
You could remove all HTML from your string using strip_tags and then search the text.
$data = '<h1 class="refname">strip_tags</h1>
<p class="para rdfs-comment">
This function tries to return a string with all NUL bytes, HTML and PHP tags stripped
from a given <em><code class="parameter">str</code></em>. It uses the same tag stripping
state machine as the <span class="function"><a href="function.fgetss.php" class="function">fgetss()</a></span> function.
</p>';
print strip_tags($data);
The above will output:
strip_tags
This function tries to return a string with all NUL bytes, HTML and PHP tags stripped
from a given str. It uses the same tag stripping
state machine as the fgetss() function.
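Applied to the strings in the question, the same idea might look like the sketch below. It assumes the title is the first non-empty line of text left once the tags are gone; first_title is a hypothetical helper name, not a built-in function.

```php
<?php
// Hypothetical helper: returns the first non-empty line of text left after stripping tags.
function first_title(string $html): ?string
{
    // Remove the tags, then decode entities such as &amp; back into characters.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    // Split on any line break and return the first line that is not just whitespace.
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        if ($line !== '') {
            return $line;
        }
    }
    return null; // no text at all
}

$sample = '<span style="font-weight: bold;">
<font size="4">Bearings<br /><br /></font>
</span>';
echo first_title($sample), "\n"; // prints "Bearings"
```

Keep in mind that strip_tags simply deletes tags without inserting line breaks, so if more text follows the title on the same line (for example directly after the br tags), it would be joined onto the title.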
Upvotes: 4
Reputation: 19237
I would suggest using an HTML parser, for example http://simplehtmldom.sourceforge.net/, otherwise your regular expressions will always miss some case.
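As a rough sketch of the parser approach, here is a version that uses PHP's built-in DOM extension (DOMDocument and DOMXPath) rather than the library linked above. It assumes the title is the first non-blank text node in the fragment; first_text_node is a hypothetical helper name.

```php
<?php
// Hypothetical helper: returns the first non-blank text node in document order.
function first_text_node(string $html): ?string
{
    $doc = new DOMDocument();
    // Real-world markup is rarely valid, so collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a div; the XML declaration hints that the input is UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // normalize-space(.) != "" skips text nodes that contain only whitespace.
    $nodes = $xpath->query('//text()[normalize-space(.) != ""]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->nodeValue);
}

$sample = '<p align="left" class="MsoNormal" style="text-align: left;">
<b/>
<span lang="EN-GB">
<font size="4" class="Apple-style-span">
Solving Equations inc. Quadratic Formula
</font>
</span>
</b>
</p>';
echo first_text_node($sample), "\n"; // prints "Solving Equations inc. Quadratic Formula"
```

Because the parser builds a tree, it does not matter how many tags surround the title or how deeply they are nested.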
Upvotes: 1