Edison

Reputation: 4281

Understanding the regular expression

I am working through the following regular expression and so far understand only part of it. It is meant to match the 10-digit North American phone number format:

^(\(\d{3}\)|^\d{3}[.-]?)?\d{3}[.-]?\d{4}$
  1. Do the caret at the beginning and the dollar sign at the end restrict this regular expression to 10-digit numbers only?

  2. Is the second caret there to mark the start of the next 3-digit number? I tried removing it and did not notice any change, so what does omitting it do?

  3. What does the vertical bar (pipe) character do? I don't understand it.

  4. As I understand it, the first backslash is for the parenthesis and the second is for the 3-digit number. Is that right?

Upvotes: 0

Views: 86

Answers (2)

ΩmegaMan

Reputation: 31606

^(\(\d{3}\)|^\d{3}[.-]?)?\d{3}[.-]?\d{4}$
  • ^ - An anchor: the match must start at the beginning of the line.
  • ( - Begins a capture group, which collects the matched text so it can be extracted later.
  • \( - Matches a literal opening parenthesis.
  • \d - A shorthand character class that matches a single digit.
  • {3} - A quantifier applied to the previous item; here it says to find exactly three digits. It could be rewritten as \d\d\d, so \d{3} as a whole means "find three digits".
  • \) - Matches a literal closing parenthesis.
  • | - Regex "or" (alternation). So far the pattern matches three digits enclosed in parentheses; what follows is the alternative.
  • ^ - Beginning of line again. Suggestion: the anchor does not need to be repeated here. Keep the single one at the beginning and let the "or" work as a sub-match.
  • \d{3} - Same as above.
  • [ - Begins a set of characters. This is similar to escaping a literal with \, but for multiple characters: a set [ ] matches any one of the characters listed inside it.
  • .- - The literal characters period (.) and dash (-). Not to be confused with . outside a set, where in most regex flavors it matches any character except a newline. Inside a set, as here, it is just a literal period.
  • ] - End of the set. The set is [.-], which lists two possible characters; only one of them will match.
  • ? - A quantifier saying that the previous item, the set, may or may not be present.
  • ) - End of the capture group.
  • ? - The whole group may or may not occur. Again, I believe the pattern's author made a mistake inside the parentheses.

.... The rest of the pattern repeats the constructs above, with the same explanations.

  • $ - End-of-line anchor. It means the entire text being tested must fit the pattern, or the match fails.
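
To see the breakdown above in action, here is a minimal sketch that runs the pattern against a few inputs. The question does not say which regex flavor is in use, so Python's re module is an assumption here, and the sample strings and the names ORIGINAL, SIMPLIFIED, and accepts are made up for illustration.

```python
import re

# The pattern exactly as posted in the question.
ORIGINAL = re.compile(r"^(\(\d{3}\)|^\d{3}[.-]?)?\d{3}[.-]?\d{4}$")

# The same pattern with the second caret removed, as suggested above.
SIMPLIFIED = re.compile(r"^(\(\d{3}\)|\d{3}[.-]?)?\d{3}[.-]?\d{4}$")

# Hypothetical sample inputs chosen to exercise each branch of the pattern.
samples = [
    "(555)123-4567",   # area code in parentheses
    "555-123-4567",    # dashes as separators
    "555.123.4567",    # dots as separators
    "5551234567",      # no separators
    "123-4567",        # no area code at all
    "555 123 4567",    # spaces are not in the [.-] set
    "(555-123-4567",   # unbalanced parenthesis
    "55512345678",     # one digit too many
]


def accepts(pattern, text):
    """Return True if the compiled pattern matches the text."""
    return pattern.match(text) is not None


for text in samples:
    print(f"{text!r:18} original={accepts(ORIGINAL, text)} "
          f"simplified={accepts(SIMPLIFIED, text)}")
```

Two things are worth noting in the results. Because the area-code group is optional, a 7-digit input such as "123-4567" is also accepted, so the anchors alone do not limit the pattern to 10 digits. And the two patterns agree on every sample, which is consistent with the asker seeing no change after removing the second caret: the group can only start at the beginning of the text anyway.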

Upvotes: 2

user557597

Reputation:

 # ^(?:\(\d{3}\)|\d{3}[.-]?)?\d{3}[.-]?\d{4}$

 # Optional area code
 ^                             # Beginning of string
 (?:                           # Cluster group start
      \( \d{3} \)                   # '(' 3-digit area code ')'
   |                              # or, 
      \d{3} [.-]?                   # just 3-digit area code, optional dot or dash
 )?                            # Cluster group end
 # 7 digit phone number
 \d{3} [.-]? \d{4}             # ( 3-digits, optional dot or dash, 4-digits )
 $                             # End of string
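
The commented layout above can be used directly in flavors that support a free-spacing (extended) mode. As a rough sketch, assuming Python's re module with the re.VERBOSE flag (the answer itself does not name a flavor, and the names PHONE and is_phone are hypothetical):

```python
import re

# re.VERBOSE ignores unescaped whitespace outside character classes and
# treats '#' to end of line as a comment, so the pattern can be annotated.
PHONE = re.compile(
    r"""
    ^                     # beginning of string
    (?:                   # non-capturing group start
        \( \d{3} \)       # '(' 3-digit area code ')'
      |                   # or
        \d{3} [.-]?       # just 3-digit area code, optional dot or dash
    )?                    # group end; the whole area code is optional
    \d{3} [.-]? \d{4}     # 3 digits, optional dot or dash, 4 digits
    $                     # end of string
    """,
    re.VERBOSE,
)


def is_phone(text):
    """Return True if the text fits the annotated phone pattern."""
    return PHONE.match(text) is not None


print(is_phone("(555)123-4567"))
print(is_phone("555 123 4567"))
```

The (?: ... ) group behaves like the plain parentheses in the question's pattern except that it does not capture, which is a reasonable choice when the area code only needs to be validated and not extracted.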

Upvotes: 1
