Reputation: 135
I saw this regular expression applied to a URL:
$url = 'http://www.domain.com/';
preg_match('/(http)(.*?)\n/', $url, $matches);
I am not sure what the question mark "?" does in this regex. According to the regex manuals, "?" is a metacharacter equivalent to {0,1}. So what is the point of putting "?" after a *, since * already represents {0,}?
Can someone please enlighten me? Thanks.
Upvotes: 0
Views: 1168
Reputation: 92986
It has a different meaning when it follows another quantifier.
In this case it changes the matching behaviour of the preceding quantifier. The default behaviour is greedy, and the ? changes it to "ungreedy" (also called lazy).
"Greedy" means match as much as possible.
"Ungreedy" means match as little as possible.
See the article on regular-expressions.info
For example:
a.+b
will match "aabxb" in aabxb
a.+?b
will match only "aab" in aabxb
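The two patterns above can be tried directly in PHP. This is only a minimal sketch: the variable names $subject, $greedy and $lazy are made up for illustration, and the subject string is the one from the example.

```php
<?php
// Subject string taken from the example above.
$subject = 'aabxb';

// Greedy: .+ takes as much as it can, then backtracks to the last "b".
preg_match('/a.+b/', $subject, $greedy);

// Ungreedy: .+? takes as little as it can and stops at the first "b" that works.
preg_match('/a.+?b/', $subject, $lazy);

echo $greedy[0], "\n"; // aabxb
echo $lazy[0], "\n";   // aab
```

Both patterns start matching at the same position; only the amount of text consumed by the quantifier differs.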
See the example here on Regexr
You may be interested in my blog post about this topic: You do know Quantifiers. Really?
About your regex
preg_match('/(http)(.*?)\n/', $url, $matches);
I don't think it makes a difference here. The . matches anything but newline characters by default (you can change this by adding an s after the closing regex delimiter), so whether the question mark is there or not, it will match only up to the first \n.
If you change the behaviour by using preg_match('/(http)(.*?)\n/s', $url, $matches); it will make a difference: .*\n would match up to the last \n, and .*?\n will stop at the first \n.
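A rough sketch of that difference, assuming a hypothetical two-line input in $text (the $url from the question contains no newline at all, so the pattern as posted would not match it). The variable names are illustrative only.

```php
<?php
// Hypothetical input; double quotes so that \n is a real newline character.
$text = "http://www.domain.com/\nsecond line\n";

// The URL from the question has no newline, so the pattern finds nothing.
$noMatch = preg_match('/(http)(.*?)\n/', 'http://www.domain.com/');

// Without the s modifier the dot never matches a newline,
// so the greedy and ungreedy versions capture the same text.
preg_match('/(http)(.*)\n/', $text, $greedyDefault);
preg_match('/(http)(.*?)\n/', $text, $lazyDefault);

// With the s modifier the dot also matches newlines, so the results diverge:
// greedy runs to the last \n, ungreedy stops at the first one.
preg_match('/(http)(.*)\n/s', $text, $greedyDotall);
preg_match('/(http)(.*?)\n/s', $text, $lazyDotall);

var_dump($noMatch);                                 // 0, no match
var_dump($greedyDefault[2] === $lazyDefault[2]);    // identical captures
var_dump($lazyDotall[2]);                           // text up to the first newline
var_dump($greedyDotall[2]);                         // text up to the last newline
```

The second capture group is compared because that is where the quantifier sits in the original pattern.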
Upvotes: 6
Reputation: 152095
In this case, the question mark means a "stingy" match. It will stop matching as soon as the first \n is encountered, while otherwise (when the s modifier lets . match newlines) it would gobble up intervening \n characters until the last.
More about greedy and stingy matching at http://www.perl.com/doc/FMTEYEWTK/regexps.html
Upvotes: 1