Vic K

Reputation: 23

A regular expression in JS

Can you help me understand what the following regexp means:

(?:.*? rv:([\w.]+))?

So,

(?: //the pattern must be in a string, but doesn't return
. //any Unicode character except newline
* //zero or more times
? //zero or one time (how is *? different from just *)
rv: //just "rv:" apparently
[\w //any digit, an underscore, or any Latin-1 letter character
.] //...or any unicode character (are Latin-1 characters not Unicode?)
..))? //all that zero or one time

It's from "The Definitive Guide" and I hate that book. Some examples of what does and doesn't match the regexp would be much appreciated.

Upvotes: 1

Views: 112

Answers (1)

fge

Reputation: 121710

The regex is:

(?:    # begin non-capturing group
.*?    # any character except a line terminator, zero or more times, but as few
       # as possible (lazy); then
" "    # a literal space (quoted here only to make it visible), followed by
rv:    # literal "rv:", followed by
(      # begin capturing group
[\w.]  # any word character or a dot (the dot HAS NO special meaning in a character class),
+      # once or more
)      # end capturing group
)      # end non-capturing group
?      # zero or one time
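Since you asked for examples of what does and doesn't match, here is a small sketch. The sample strings and the helper name `capturedVersion` are made up for illustration. Note that because the whole group is optional, the regex as a whole always "matches" (possibly the empty string), so the interesting question is whether group 1 captured anything.

```javascript
// The pattern from the question, used on its own.
const re = /(?:.*? rv:([\w.]+))?/;

// Hypothetical helper: returns what group 1 captured, or undefined
// when the optional group did not take part in the match.
function capturedVersion(text) {
  // exec() never returns null here: the pattern can match the empty string.
  return re.exec(text)[1];
}

// Illustrative inputs only.
const samples = [
  "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
  "a rv:1.2.3 b rv:4.5", // two candidates; the lazy .*? stops at the first
  "rv:1.9",              // no space before "rv:", so the group fails
  "foo rv:",             // nothing after "rv:", so [\w.]+ fails
  "no version here",
];

for (const s of samples) {
  console.log(JSON.stringify(s), "->", capturedVersion(s));
}
```

The first two inputs capture "109.0" and "1.2.3"; the other three leave group 1 undefined.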

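*? is what is called a lazy quantifier: before swallowing each further character, the regex engine first checks whether the rest of the pattern can match at the current position. It is used, overused and abused, and this is one case. Since the next token is a literal space, it can be replaced with [^ ]* (anything which is NOT a space, zero or more times), which avoids that repeated checking. Be aware that this is only equivalent when the text to be skipped contains no other space; if it does (as in a typical browser user-agent string), [^ ]* stops at the first space and the group will not match from there.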
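To make the difference concrete, here is a rough comparison of the lazy form, a greedy form, and the negated-class form. The variable names, the helper `group1` and the sample strings are assumptions for illustration, not anything from the book.

```javascript
// Three variants of the same idea.
const lazy = /(?:.*? rv:([\w.]+))?/;     // as few characters as possible
const greedy = /(?:.* rv:([\w.]+))?/;    // as many characters as possible
const negated = /(?:[^ ]* rv:([\w.]+))?/; // only non-space characters

// Hypothetical helper: what did group 1 capture (undefined if nothing)?
const group1 = (regex, text) => regex.exec(text)[1];

const twoVersions = "a rv:1.2.3 b rv:4.5";
console.log(group1(lazy, twoVersions));    // first " rv:" wins
console.log(group1(greedy, twoVersions));  // last " rv:" wins
console.log(group1(negated, twoVersions)); // only "a" precedes the space, so this works

// With an extra space before " rv:", the negated class cannot reach it
// from the start of the string, and the optional group matches nothing.
const extraSpace = "Linux x86_64; rv:109.0)";
console.log(group1(lazy, extraSpace));
console.log(group1(negated, extraSpace));
```

On the first string the three variants capture "1.2.3", "4.5" and "1.2.3" respectively; on the second, the lazy form captures "109.0" while the negated-class form captures nothing.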

Definitive. Right.

Upvotes: 2
