WebWanderer

Reputation: 10867

Regex: Non-Capturing Group in Non-Numeric?

I am trying to test that a timestamp (let's use HH:MM:ss as an example) does not have any numeric characters surrounding it. Put another way, I would like to check for the presence of a non-numeric character before and after my timestamp. The non-numeric character does not need to exist, but no numeric character should exist directly before nor directly after. I do not want to capture this non-numeric character. How can I do this? Should I use "look-arounds" or non-capturing groups?

Fill in the blank + (2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9]) + Fill in the blank

Thanks!

Upvotes: 1

Views: 311

Answers (2)

Wiktor Stribiżew

Reputation: 626903

I would like to check for the presence of a non-numeric character before and after my timestamp. The non-numeric character does not need to exist, but no numeric character should exist directly before nor directly after. I do not want to capture this non-numeric character.

The best way to match such a timestamp is using lookarounds:

(?<!\d)(2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9])(?!\d)

The (?<!\d) fails a match if there is a digit before the timestamp and (?!\d) fails a match if there is a digit after the timestamp.
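To see the lookarounds in action, here is a minimal sketch in JavaScript. It assumes an engine with lookbehind and `matchAll` support (ES2018/ES2020, for example a recent Node.js or browser); `findTimestamps` is a hypothetical helper name, not something from the question.

```javascript
// Sketch only: assumes lookbehind support (ES2018+) and String.prototype.matchAll (ES2020).
const timestampRe = /(?<!\d)(2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9])(?!\d)/g;

// findTimestamps is a hypothetical helper: it returns capture group 1 of every match.
function findTimestamps(text) {
  return Array.from(text.matchAll(timestampRe), (m) => m[1]);
}

console.log(findTimestamps("Some23:56:43text")); // letters on both sides are fine
console.log(findTimestamps("923:56:43"));        // digit directly before: no match
console.log(findTimestamps("23:56:431"));        // digit directly after: no match
```

Because the lookarounds are zero-width, the neighbouring characters are only inspected and never become part of the match.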

If you use

\D*(2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9])\D*

(note that (?:...) non-capturing groups add nothing here: the patterns inside still match and consume characters), the surrounding non-digit characters become part of the match, so you won't get overlapping matches (for example, if there is a timestamp right after the timestamp). However, I believe this is a rare scenario, so you can still use your regex and grab the value inside capture group 1.
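As a side check that is not part of the original answer, the two patterns can be compared directly. The sketch below assumes an engine with lookbehind support (ES2018+); `firstTimestamp` and the sample strings are made up for illustration. One thing worth watching for: `\D*` may match zero characters, so on its own it does not seem to reject a digit sitting right next to the timestamp.

```javascript
// Sketch only: assumes lookbehind support (ES2018+).
const core = "(2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9])";
const consumingRe = new RegExp("\\D*" + core + "\\D*");         // \D* consumes the neighbours
const lookaroundRe = new RegExp("(?<!\\d)" + core + "(?!\\d)"); // zero-width checks only

// firstTimestamp is a hypothetical helper: capture group 1 of the first match, or null.
function firstTimestamp(re, text) {
  const m = re.exec(text);
  return m === null ? null : m[1];
}

const sample = "at 12:34:56 sharp";
console.log(consumingRe.exec(sample)[0]);          // whole match includes the surrounding text
console.log(firstTimestamp(consumingRe, sample));  // group 1 is still just the timestamp
console.log(firstTimestamp(lookaroundRe, sample));

const glued = "112:34:56";                         // a digit directly before the timestamp
console.log(firstTimestamp(consumingRe, glued));   // \D* matches empty, so this still matches
console.log(firstTimestamp(lookaroundRe, glued));  // the lookbehind rejects it
```

If rejecting adjacent digits is the actual requirement, the lookaround form (or an explicit `(?:^|\D)` prefix) looks like the safer choice.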

Also, see my answer on How Negative Lookahead Works. A negative lookbehind works similarly, but with the text before the matching (consuming) pattern.

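JavaScript regex did not support lookbehind before ES2018, so a JS solution is to consume the preceding character with a non-capturing group and read the timestamp from capture group 1: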

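var re = /(?:^|\D)(2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9])(?=\D|$)/g;
var text = "Some23:56:43text here Some13:26:45text there and here is a date 10/30/89T11:19:00am";
var m; // declared explicitly so it does not leak as an implicit global
// Print capture group 1 (the bare timestamp) for every match.
while ((m = re.exec(text)) !== null) {
  document.body.innerHTML += m[1] + "<br/>";
}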

Upvotes: 2

nickb

Reputation: 59699

The regex class for "anything that is not numeric" is:

\D

This is equivalent to:

[^\d]
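If you want to convince yourself of the equivalence, a brute-force check is easy to write. This is only a sketch for JavaScript regex semantics, and `classesAgree` is a hypothetical helper name:

```javascript
// Sketch: compare \D and [^\d] on every single UTF-16 code unit.
const shorthand = /\D/;
const negatedClass = /[^\d]/;

// classesAgree is a hypothetical helper: true if both classes accept exactly the same characters.
function classesAgree() {
  for (let code = 0; code <= 0xffff; code++) {
    const ch = String.fromCharCode(code);
    if (shorthand.test(ch) !== negatedClass.test(ch)) return false;
  }
  return true;
}

console.log(classesAgree());
```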

So you would use:

\D*(2[0-3]:[0-5][0-9]:[0-5][0-9]|[0-1][0-9]:[0-5][0-9]:[0-5][0-9])\D*

You don't need to surround it with a non-capturing group (?:).

Upvotes: 2
