Johnathan Au

Reputation: 5362

How can I search for words in between html tags in PHP?

How can I search for words in between HTML tags? Say I have the following strings:

<span style="font-weight: bold;">
    <font size="4">Bearings<br /><br /></font>
</span>

<span style="font-weight: bold;">
    <font size="4">
        Scale Drawing &amp; Error in Measurement<br /><br />
    </font>
</span>    

<p align="left" class="MsoNormal" style="text-align: left;">
    <b/>
    <span lang="EN-GB">
        <font size="4" class="Apple-style-span">
            Solving Equations inc. Quadratic Formula
        </font>
    </span>
    </b> 
</p>

How can I search for the titles: Bearings, Scale Drawing &amp; Error in Measurement, and Solving Equations inc. Quadratic Formula? Bear in mind that the HTML tags before and after the titles are dynamic: they could be any tags, and there could be any number of them. The titles themselves are also dynamic; I don't know what they are, which is why I'm searching for them. However, I know that they are at the start of the string, which means I could search for a double quotation mark followed by a right angle bracket ">, then a wildcard *, then a left angle bracket and forward slash </

"> * </ 

Note that I don't know regex. I'm only suggesting a search along those lines because the very first occurrence of </ implies the title is right before it.

Upvotes: 0

Views: 257

Answers (3)

Brett Wait

Reputation: 397

Are you trying to do this at runtime in the browser? You could use JavaScript and the DOM innerHTML property. You say the HTML is dynamic and could vary, but if the titles are always inside a particular tag, something like this might work for you.

<script type="text/javascript">
    // getElementsByTagName returns a collection, so take the first <span>
    var txt = document.getElementsByTagName("span")[0].innerHTML;
    document.write(txt);
</script>

See more here: http://www.w3schools.com/htmldom/dom_methods.asp

Upvotes: 1

kba

Reputation: 19466

You could remove all HTML from your string using strip_tags and then search the text.

$data = '<h1 class="refname">strip_tags</h1>
<p class="para rdfs-comment">
   This function tries to return a string with all NUL bytes, HTML and PHP tags stripped
   from a given <em><code class="parameter">str</code></em>.  It uses the same tag stripping
   state machine as the <span class="function"><a href="function.fgetss.php" class="function">fgetss()</a></span> function.
  </p>';

print strip_tags($data);

The above will output

strip_tags

This function tries to return a string with all NUL bytes, HTML and PHP tags stripped
from a given str. It uses the same tag stripping
state machine as the fgetss() function.
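
Applied to the strings in the question, the same idea might look like the sketch below. It assumes each string holds only the title markup and nothing after it; extract_title is a hypothetical helper name, and the entity decoding and whitespace cleanup are additions on top of strip_tags.

```php
<?php
// Hypothetical helper: visible text of an HTML fragment, with tags removed,
// entities decoded, and whitespace collapsed.
function extract_title(string $html): string
{
    $text = strip_tags($html);                                          // drop every tag
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8'); // &amp; becomes &
    $text = preg_replace('/\s+/', ' ', $text);                          // collapse runs of whitespace
    return trim($text);
}

// Sample strings taken from the question.
$samples = [
    '<span style="font-weight: bold;"><font size="4">Bearings<br /><br /></font></span>',
    '<span style="font-weight: bold;">
        <font size="4">
            Scale Drawing &amp; Error in Measurement<br /><br />
        </font>
    </span>',
];

foreach ($samples as $html) {
    echo extract_title($html), PHP_EOL;
}
// Prints:
// Bearings
// Scale Drawing & Error in Measurement
```

If the strings carry more text after the title, this returns all of it, so you would still need to cut the result down to the first piece.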

Upvotes: 4

Gavriel

Reputation: 19237

I would suggest using an HTML parser, for example http://simplehtmldom.sourceforge.net/; otherwise your regular expressions will always miss some case.
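
As a rough illustration of the parser approach, here is a sketch that uses PHP's bundled DOM extension (DOMDocument and DOMXPath) rather than Simple HTML DOM, so nothing extra needs installing. first_text is a hypothetical helper name. It returns the first non-blank text node in document order, which matches the assumption in the question that the title comes first in the string.

```php
<?php
// Hypothetical helper: first non-blank piece of text in an HTML fragment.
function first_text(string $html): ?string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);  // collect warnings about sloppy markup quietly
    // The leading XML declaration is a common trick to make libxml read the input as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Text nodes that contain something other than whitespace, in document order.
    $nodes = $xpath->query('//text()[normalize-space()]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    // Entities are already decoded by the parser; only whitespace needs tidying.
    return trim(preg_replace('/\s+/', ' ', $nodes->item(0)->nodeValue));
}

echo first_text('<span style="font-weight: bold;"><font size="4">Bearings<br /><br /></font></span>'), PHP_EOL;
// Prints: Bearings
```

Because the parser builds a tree, the number and kind of wrapping tags do not matter, and any text that follows the title is ignored.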

Upvotes: 1
