Webeng

Reputation: 7113

Understanding this regex example

The following example is JavaScript code that uses a regex:

// Make a regular expression object that matches
// a JavaScript string.
var my_regexp = /"(?:\\.|[^\\\"])*"/g;

My current understanding of the previous regex example (/"(?:\\.|[^\\\"])*"/g) is as follows:

Global (match multiple times; the precise meaning of this varies with the method)

So basically its meaning depends on which function I use the regex variable in.

MY FIRST DOUBT: I have seen a different syntax for non-capturing groups, (?:...)?, which has a ? at the end. Is there a difference between that and having no ? at the end? Is the * replacing the ? here, so that the group matches zero or more times rather than zero or one?

MY LAST DOUBT: The last thing I don't understand is the four characters \\.| inside the group. I believe \\ matches a backslash and . matches any character, but I'm not sure about |. I'm fairly sure the inside of the non-capturing group isn't saying to search for \[anything]|[anything except for \ and "], because the comment in the example above says: // Make a regular expression object that matches a JavaScript string.

QUESTION: Would anyone be able to clarify the doubts I am having above?

Upvotes: 0

Views: 127

Answers (1)

SamWhan

Reputation: 8332

A group starting with (?: is, as you say, a non-capturing group. It means that the part it matches isn't stored in a capture group for later retrieval. The ? after a group is a separate quantifier, not part of the group syntax: it makes the group optional (zero or one occurrence), so the part it's supposed to match isn't required for the whole regex to match. In your regex the group is followed by * instead, so it can repeat zero or more times. It's not uncommon for non-capturing groups to be optional.
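To make the difference between the two quantifiers concrete, here is a minimal sketch. The patterns and input strings are made up for illustration and are not taken from the question.

```javascript
// Hypothetical patterns: the same non-capturing group (?:ab) with two quantifiers.
const optional = /^x(?:ab)?y$/; // group may appear zero or one time
const repeated = /^x(?:ab)*y$/; // group may appear zero or more times

const inputs = ["xy", "xaby", "xababy"];
const optionalResults = inputs.map((s) => optional.test(s));
const repeatedResults = inputs.map((s) => repeated.test(s));

console.log(optionalResults); // [ true, true, false ]
console.log(repeatedResults); // [ true, true, true ]

// Non-capturing vs. capturing: only the capturing group appears in the result.
const nonCapturing = "xaby".match(/x(?:ab)y/);
const capturing = "xaby".match(/x(ab)y/);

console.log(nonCapturing.length); // 1 (only the full match)
console.log(capturing[1]);        // "ab"
```

So the ? or * only controls how often the group may occur, while (?: only controls whether the matched text is remembered.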

The alternation operator | matches the sequence on either side of it, trying the left side first.

So your regex matches a string

  • starting with a ", followed by zero or more of either
  • an escaped character (a \ followed by any character other than a line terminator) or
  • a character that isn't a \ or a ",
  • and finally ending with a ".
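The breakdown above can be checked with a short sketch. The sample source text and the naive comparison pattern are assumptions made up for this illustration; only the first regex comes from the question.

```javascript
// The regex from the question.
const stringLiteral = /"(?:\\.|[^\\\"])*"/g;

// Hypothetical sample text. String.raw keeps the backslashes literal,
// so the text really contains \" inside the second string.
const source = String.raw`var a = "plain"; var b = "say \"hi\""; var c = "";`;

// With the g flag, String.prototype.match returns every match.
const found = source.match(stringLiteral);
console.log(found.length); // 3
console.log(found[1]);     // "say \"hi\""

// For comparison, a naive pattern without the \\. alternative
// stops at the first escaped quote.
const naive = /"[^"]*"/g;
const naiveFound = source.match(naive);
console.log(naiveFound[1]); // "say \"
```

The \\. branch is what lets an escaped quote pass as part of the string instead of ending it, which is why it is tried before the character class.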

PS. You don't need to escape the " inside the character class. /"(?:\\.|[^\\"])*"/g is OK.
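As a quick sanity check of that claim, the following sketch runs both spellings against the same made-up sample string (the sample is an assumption, not from the question) and compares the results.

```javascript
// Original spelling, with the quote escaped inside the character class.
const withEscape = /"(?:\\.|[^\\\"])*"/g;
// Simplified spelling, without the extra escape.
const withoutEscape = /"(?:\\.|[^\\"])*"/g;

// Hypothetical sample containing an escaped quote and an escaped backslash.
const sample = String.raw`x = "a\"b" + "c\\" + "d"`;

const resultWithEscape = sample.match(withEscape);
const resultWithoutEscape = sample.match(withoutEscape);

const sameResult =
  JSON.stringify(resultWithEscape) === JSON.stringify(resultWithoutEscape);

console.log(resultWithEscape.length); // 3
console.log(sameResult);              // true
```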

Upvotes: 2
