Reputation: 4471
I have a table where each row consist of an attribute which consist of html data with like this.
<div className="single_line"><p>New note example</p></div>
I need to omit the html tags and extract only the data inside the tags using sql query. Any idea on how to achieve this?. I tried out different regex but they didnt work.
Upvotes: 3
Views: 1411
Reputation: 422
There are 2 solutions based on mysql version.
If you are using MySQL 8.0 then you can use REGEXP_REPLACE() directly inside the select statement.
SELECT REGEXP_REPLACE('<div><p>New note example</p></div>', '(<[^>]*>)|( )', '');
If you are using MySQL 5.7 then you have to create a user define function in database to strip html tags.
DROP FUNCTION IF EXISTS fn_strip_html_tags;
CREATE FUNCTION fn_strip_html_tags( html_text TEXT ) RETURNS TEXT
BEGIN
DECLARE start,end INT DEFAULT 1;
DECLARE text_without_nbsp TEXT;
LOOP
SET start = LOCATE("<", html_text, start);
IF (!start) THEN RETURN html_text; END IF;
SET end = LOCATE(">", html_text, start);
IF (!end) THEN SET end = start; END IF;
SET text_without_nbsp = REPLACE(html_text, " ", " ");
SET html_text = INSERT(text_without_nbsp, start, end - start + 1, "");
END LOOP;
END
For example
SELECT fn_strip_html_tags('<div><p>New note example</p></div>');
Upvotes: 4