Reputation: 81
I have a column in a MYSQL table that accepts TEXT which has HTML tags. Now I want to find all the URLs that have .png/.jpg in the HTML img src attribute.
For example:
Table name: mdl_course_sections; Column name: Summary; Data type of Summary: TEXT
Sample record:
Summary
----------
<table border="0">
<tbody>
<tr>
<td><img style="display: block; margin-left: auto; margin-right: auto;" src="http://xyz.au/abc.png" height="64" width="64" /></a></td>
<td> Learn about video slideshows.</td>
</tr>
<tr>
<td><img src="http://i1325.photobucket.com/albums/u630/teststudplane/plane-2.jpg" height="95" width="100" /></td>
<td> Explore how video can reinvent education.</td>
</tr>
</table>
Upvotes: 1
Views: 58
Reputation: 51411
You are going to struggle to do this inside SQL itself. Don't try, you'll end up pulling your hair out.
Instead, go through each of the entries, use an HTML parser to extract the src
attribute of each img
tag, and process from there.
The built-in loadHTML
method of the DOMDocument class may help, but it can be rather sensitive to poorly-formed HTML. You could also try something like Simple HTML DOM, which lets you use CSS-like selectors, kind of like how jQuery works.
Upvotes: 2