Siphon
Siphon

Reputation: 1061

Find email links in HTML using MySQL

The HTML is stored within MySQL. What I need to do is find out if there are href links containing an email AND do not have mailto: prefixed to the email. Can this be done in MySQL?

This should be found by the query:

... <a href="[email protected]">[email protected]</a> ...

but not this one:

... <a href="mailto:[email protected]">[email protected]</a> ...

Note: I can use PHP/Python and parse the HTML if required, but I'm hoping there is a faster/easier way to do this by only using MySQL.

Bonus Question:

Can you use the above query in an update to add the missing mailto?

Upvotes: 0

Views: 91

Answers (1)

spaniol6
spaniol6

Reputation: 606

You can use MySQL REGEXP to find if there are any emails without the mailto.

SELECT * FROM 'table' WHERE 'column' REGEXP 'href\=\"[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\"'

I believe that regex should match anything in this format: href="[email protected]" But it won't match: href="mailto:[email protected]"

Upvotes: 1

Related Questions