Jason
Jason

Reputation: 2727

How do you find words with hyphens in a MYSQL REGEXP query using word boundries?

I have a MYSQL query to try to find words with hyphens. I am using the MYSQL word boundary.

SELECT COUNT(id) 
AS count 
FROM table 
WHERE (name REGEXP '^[[<:]]some-words-with-hyphens[[:>:]]/')

This seems to work, although the following does not (see the - after the word "hyphens"):

SELECT COUNT(id) 
AS count 
FROM table 
WHERE (words REGEXP '^[[<:]]some-words-with-hyphens-[[:>:]]/')

I tried to escape the -'s with \- but that did not seem to change the result. I also tried to put the - in brackets like [-], but that did not seem to change the result.

What would be the proper way to write this query with the understanding that hyphens will be within and possibly at the end of the "word"?

Upvotes: 3

Views: 2106

Answers (2)

Tim Pietzcker
Tim Pietzcker

Reputation: 336128

Assuming that some-words-with-hyphens is actually a regex and not some verbatim text, you could simply add an optional - at the end of the regex in order to match a trailing dash if it's present:

WHERE (name REGEXP '^[[<:]]some-words-with-hyphens[[:>:]]-?/')

@eggyal has already explained why the word boundary matches before that hyphen.

Upvotes: 0

eggyal
eggyal

Reputation: 125855

As documented under Regular Expressions:

A regular expression for the REGEXP operator may use any of the following special characters and constructs:

[ deletia ]

  • [[:<:]], [[:>:]]

    These markers stand for word boundaries. They match the beginning and end of words, respectively. A word is a sequence of word characters that is not preceded by or followed by word characters. A word character is an alphanumeric character in the alnum class or an underscore (_).

    mysql> SELECT 'a word a' REGEXP '[[:<:]]word[[:>:]]';   -> 1
    mysql> SELECT 'a xword a' REGEXP '[[:<:]]word[[:>:]]';  -> 0
    

Since - and / are both non-word characters, the [[:>:]] construct does not match the point between them.

It's not clear why you're using these constructs at all, as the following ought to do the trick:

words REGEXP '^some-words-with-hyphens-/'

Indeed, it's not clear why you're even using regular expressions in this case, as simple pattern matching can achieve the same:

words LIKE 'some-words-with-hyphens-/%'

Upvotes: 3

Related Questions