David542

Reputation: 110083

RegExp in MySQL for field

I have the following query:

SELECT item FROM `table`

Which gives me:

<title>Titanic</title>

How would I extract the name "Titanic" from this? Something like:

SELECT re.find('>(.+)<', item) FROM `table`

What would be the correct syntax for this?

Upvotes: 2

Views: 285

Answers (4)

Michael - sqlbot

Reputation: 179004

XML shouldn't be parsed with regexes, and at any rate MySQL's REGEXP only supports matching, not extraction or replacement (REGEXP_SUBSTR and REGEXP_REPLACE arrived only in MySQL 8.0).

But MySQL supports XPath 1.0. You should be able to simply do this:

SELECT ExtractValue(item, '/title') AS item_title FROM `table`;

https://dev.mysql.com/doc/refman/5.6/en/xml-functions.html
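
If the value has already been fetched into application code, the same advice (use an XML parser rather than a regex) can be followed there too. A minimal sketch in Python, assuming each item is a small well-formed XML fragment whose root element is <title>; the function name title_text is hypothetical:

```python
import xml.etree.ElementTree as ET

def title_text(item):
    """Return the text of a root <title> element, or None for any other root."""
    root = ET.fromstring(item)  # raises ParseError if item is not well-formed XML
    return root.text if root.tag == "title" else None

title = title_text("<title>Titanic</title>")
```

This mirrors the '/title' XPath above only loosely: it checks the root element and nothing deeper.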

Upvotes: 0

jpw

Reputation: 44871

As pointed out in the informative answer by George Bahij, MySQL lacks this functionality, so the options are either to extend it (with UDFs, for example) or to use the available string functions, in which case you could do:

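SELECT
  SUBSTR(
    SUBSTRING_INDEX(
      SUBSTRING_INDEX(item, '<title>', 2),  -- everything before a second <title>, if any
      '</title>', 1)                        -- everything before the first </title>
    FROM 8                                  -- skip the leading 7-character '<title>'
  )
FROM `table`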

Or, if the string you need to extract from is always in the format <title>item</title>, then you could simply use REPLACE: REPLACE(REPLACE(item, '<title>', ''), '</title>', '')

Upvotes: 1

George Bahij

Reputation: 627

By default, MySQL does not provide functionality for extracting text using regular expressions. You can use REGEXP to find rows that match something like >.+<, but there is no straightforward way of extracting the captured group without some additional effort, such as:

  • using a library like lib_mysqludf_preg
  • writing your own MySQL function to extract matched text
  • performing regular string manipulation
  • using the regex functionality of whatever environment you're using MySQL from (e.g. PHP's preg_match)
  • reconsidering your need for regular expressions entirely. If you know that all your rows contain a <title> tag, for instance, it may be a better idea to simply use "normal" string functions such as SUBSTRING
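
As an illustration of the "regex functionality of your environment" option, here is a minimal sketch in Python rather than PHP. The rows list is made-up sample data standing in for whatever the query returns, and extract_title is a hypothetical helper name:

```python
import re

# Hypothetical rows, standing in for the values returned by the SELECT.
rows = ["<title>Titanic</title>", "<title>Alien</title>", "no tags here"]

# Capture the text between the opening and closing <title> tags (non-greedy).
TITLE_RE = re.compile(r"<title>(.+?)</title>")

def extract_title(item):
    """Return the captured title text, or None when the tag pair is absent."""
    match = TITLE_RE.search(item)
    return match.group(1) if match else None

titles = [extract_title(item) for item in rows]
```

Rows without a <title> pair come back as None, so they can be filtered out or reported separately.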

Upvotes: 2

Daniel Waghorn

Reputation: 2985

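This regex: <[[:alnum:]_]+>.+</[[:alnum:]_]+> will match content wrapped in tags. A POSIX character class is used rather than \w because MySQL string literals drop the backslash (so '\w' becomes 'w'), and REGEXP before MySQL 8.0 does not support \w in any case.

Your query should be something like:

SELECT * FROM `table` WHERE `field` REGEXP '<[[:alnum:]_]+>.+</[[:alnum:]_]+>';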

Then if you're using PHP or something similar you could use a function like strip_tags to extract the content between the tags.
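
For environments without a built-in strip_tags, something similar can be sketched on top of a standard parser. The version below is Python, not PHP; the strip_tags name is borrowed from PHP only for familiarity, and it assumes the fetched field holds simple tag-wrapped text:

```python
from html.parser import HTMLParser

class _TextCollector(HTMLParser):
    """Collects the text nodes and ignores the tags themselves."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def strip_tags(markup):
    """Return markup with its tags removed, keeping only the text content."""
    collector = _TextCollector()
    collector.feed(markup)
    collector.close()  # flush any text still buffered by the parser
    return "".join(collector.parts)

title = strip_tags("<title>Titanic</title>")
```

Unlike a regex capture, this keeps all text in the field, so surrounding text outside the tags would be included as well.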

Upvotes: 0
