Why does this comparison return ? (Regular Expressions)

Question

1) mysql> SELECT 'aXbc' REGEXP '[a-dXYZ]';                 -> 1
2) mysql> SELECT 'aXbc' REGEXP '^[a-dXYZ]$';               -> 0
3) mysql> SELECT 'aXbc' REGEXP '^[a-dXYZ]?$';              -> 0   // 0 or 1
4) mysql> SELECT 'aXbc' REGEXP '^[a-dXYZ]+$';              -> 1   // 1 or more
5) mysql> SELECT 'aXbc' REGEXP '^[a-dXYZ]*$';              -> 1   // 0 or more

I'm confused about the second comparison. Doesn't it match any string that starts with [a-dXYZ] and ends with [a-dXYZ]?

Or does it match only a string that starts with [a-dXYZ], ends with [a-dXYZ], and has a length of 1? If so, does wrapping a pattern in ^ and $ like '^....$' (leftmost and rightmost) rule out substring matches, so that only the string as a whole is tested?

Note: This is not a "how do I make it work" question. I want to understand the cause-and-effect relationship.

Andreas Louv · Accepted Answer

^[a-dXYZ]$ matches exactly one character, which must be one of a, b, c, d, X, Y, or Z, and which is both the first and the last character of the input. So your second interpretation is spot on.

^ means start of input and $ means end of input.
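To see the effect of the anchors in isolation, here is a minimal sketch using Python's re module as a stand-in for MySQL's REGEXP (an assumption: the two engines differ in general, but they agree on these simple patterns). The helper name matches is hypothetical and only mimics the 1/0 result.

```python
import re

CHAR_CLASS = r"[a-dXYZ]"  # one character: a, b, c, d, X, Y, or Z


def matches(pattern: str, text: str) -> int:
    """Hypothetical helper: return 1 or 0, like MySQL's REGEXP operator."""
    return 1 if re.search(pattern, text) else 0


# Unanchored: succeeds if any single character of the input is in the class.
unanchored = matches(CHAR_CLASS, "aXbc")

# Anchored on both sides: the whole input must be exactly one class character.
anchored_long = matches(f"^{CHAR_CLASS}$", "aXbc")  # four characters
anchored_one = matches(f"^{CHAR_CLASS}$", "X")      # one character

print(unanchored, anchored_long, anchored_one)  # prints: 1 0 1
```

With both anchors in place there is no room for a substring match: the pattern has to account for every character between the start and the end of the input.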

You can use * to repeat 0 or more times or + to repeat 1 or more times:

^[a-dXYZ]+$
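The quantifiers from comparisons 3 to 5 can be checked the same way. This is a rough sketch with Python's re module standing in for MySQL (an assumption, as the engines are not identical in general); the empty string and 'aXbe' are extra inputs added here for illustration.

```python
import re

# Anchored patterns that differ only in the quantifier after the class.
PATTERNS = {
    "?": r"^[a-dXYZ]?$",  # zero or one character from the class
    "+": r"^[a-dXYZ]+$",  # one or more characters from the class
    "*": r"^[a-dXYZ]*$",  # zero or more characters from the class
}


def check(subject: str) -> dict:
    """Map each quantifier to 1 (match) or 0 (no match) for the subject."""
    return {q: int(re.search(p, subject) is not None) for q, p in PATTERNS.items()}


four_chars = check("aXbc")  # all four characters are in the class
empty = check("")           # nothing to match at all
bad_char = check("aXbe")    # 'e' is outside the class

print(four_chars)  # {'?': 0, '+': 1, '*': 1}
print(empty)       # {'?': 1, '+': 0, '*': 1}
print(bad_char)    # {'?': 0, '+': 0, '*': 0}
```

Because of the anchors, one character outside the class is enough to make every variant fail, as the 'aXbe' row shows.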

If you want to match a string that starts and ends with a character from [a-dXYZ] (and is at least two characters long), you could use the following:

^[a-dXYZ].*[a-dXYZ]$
#        ^ Match anything zero or more times
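A quick way to convince yourself of what this pattern accepts is to run it against a few inputs. The sketch below again uses Python's re module as a stand-in engine (an assumption), and the sample strings are made up for illustration.

```python
import re

# First and last characters must be in the class; anything may sit between.
pattern = re.compile(r"^[a-dXYZ].*[a-dXYZ]$")

samples = ["aXbc", "a--hello--Z", "a", "aXbe", "eXbc"]
results = {s: bool(pattern.search(s)) for s in samples}

for text, ok in results.items():
    print(f"{text!r}: {ok}")
# 'aXbc': True          starts with a, ends with c
# 'a--hello--Z': True   the middle part is unrestricted
# 'a': False            two class characters are required
# 'aXbe': False         last character is outside the class
# 'eXbc': False         first character is outside the class
```

Note that a single-character input fails, because the pattern contains two separate character classes that each need their own character.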

Some regex implementations offer an m (multiline) modifier that makes ^ and $ match at the beginning and end of each line rather than of the whole input. Consider the following Perl snippet:

s/^1+$/_/gm

It replaces every line that consists only of ones:

% seq 1 11 | tac | perl -p0e 's/^1+$/_/gm'
_
10
9
8
7
6
5
4
3
2
_
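For comparison, the same idea can be sketched in Python, where the re.MULTILINE flag plays the role of Perl's m modifier. The input is built in code rather than piped from seq and tac, and unlike that pipeline it has no trailing newline; both are choices made only for this sketch.

```python
import re

# Same numbers as `seq 1 11 | tac`: 11 down to 1, one per line.
text = "\n".join(str(n) for n in range(11, 0, -1))

# Without the flag, ^ and $ refer to the whole input, so nothing matches.
without_flag = re.sub(r"^1+$", "_", text)

# With re.MULTILINE, ^ and $ also match at the start and end of every line.
with_flag = re.sub(r"^1+$", "_", text, flags=re.MULTILINE)

print(without_flag == text)   # True: the input is left unchanged
print(with_flag.split("\n"))  # first and last lines become "_"
```

Only the lines 11 and 1 consist purely of ones, so those are the two that get replaced; 10 is left alone because of its trailing zero.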
