dingledooper

Reputation: 170

Match exactly one regex term

I want a regex that matches only when exactly one of several terms holds. For example, given the string 01010, I want to match the 0s that have exactly one neighboring 1.

The regex I currently have is 0(?=1)|(?<=1)0, but it matches all three 0s, when I really want to exclude the middle one, since it has two neighboring 1s, not one.

This might not be that hard since there are only two terms to check for, but it seems harder if the number of terms is greater. For example, what if I not only want to check for a neighboring 1, but also a 1 that is exactly 3 characters away?

Upvotes: 1

Views: 361

Answers (2)

Cary Swoveland

Reputation: 110675

The following regular expression addresses a generalization of the question. It matches every character in a string that: 1) is followed by the same character and not preceded by the same character; or 2) is preceded by the same character and not followed by the same character.

^(.)(?=\1)|(?<=(.))(?=\2).$|(?<=(.))(?:(?=\3).(?!\3)|(?!\3)(.)(?=\4))

Demo

The regex engine performs the following operations.

^          match beginning of line
(.)        match first char and save to capture group 1
(?=\1)     following char is the same char
|          or
(?<=(.))   save the preceding char to capture group 2
(?=\2)     char equals preceding char
.          match char
$          match end of line
|          or
(?<=(.))   save preceding char to capture group 3
(?:        begin a non-capture group
  (?=\3)   char equals preceding char
  .        match char
  (?!\3)   following char is different
  |        or
  (?!\3)   char does not equal preceding char
  (.)      save char in capture group 4
  (?=\4)   following char is the same
)          end non-capture group
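
The rule this pattern encodes (a character with exactly one identical neighbor) can also be restated without a regex, which gives a reference to test the pattern against in whichever engine is used. A minimal sketch in Python; the function name `one_equal_neighbor` is hypothetical and not part of the answer above.

```python
def one_equal_neighbor(s: str) -> list:
    """Return the indices of characters with exactly one identical neighbor."""
    hits = []
    for i, ch in enumerate(s):
        same_before = i > 0 and s[i - 1] == ch
        same_after = i + 1 < len(s) and s[i + 1] == ch
        # True only when exactly one of the two sides matches.
        if same_before != same_after:
            hits.append(i)
    return hits


print(one_equal_neighbor("aabcccd"))  # [0, 1, 3, 5]
```

In "aabcccd" the middle c is skipped because both of its neighbors are c, which mirrors how the regex excludes a character that is both preceded and followed by the same character.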

Upvotes: 0

ICloneable

Reputation: 633

Your pattern matches a 0 that is either followed by or preceded by a 1, but nothing restricts it to only one of the two. You can add a negative lookbehind and a negative lookahead to enforce that.

Try something like the following:

(?<!1)0(?=1)|(?<=1)0(?!1)

Demo
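
To see the difference between the two patterns, here is a small sketch using Python's `re` module, run on the string from the question. The helper name `match_positions` is made up for illustration.

```python
import re

# Pattern from the question: a 0 with a 1 on either side (or both).
original = re.compile(r"0(?=1)|(?<=1)0")
# Pattern with the added negative lookarounds: a 1 on exactly one side.
exactly_one = re.compile(r"(?<!1)0(?=1)|(?<=1)0(?!1)")


def match_positions(pattern, text):
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in pattern.finditer(text)]


print(match_positions(original, "01010"))     # [0, 2, 4]
print(match_positions(exactly_one, "01010"))  # [0, 4]
```

The middle 0 at index 2 drops out because it fails `(?<!1)` in the first alternative and `(?!1)` in the second.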


Edit

If you want to match when the 0 has a 1 as a neighbor or a 1 that is 3 characters away, things get a little more complicated, but the same rule applies. Something like this would work:

(?<!1|1.{2})0(?=1|.{2}1)|(?<=1|1.{2})0(?!1|.{2}1)

Demo.
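
Not every engine accepts alternatives of different lengths inside a lookbehind. Python's `re` module, for one, requires fixed-width lookbehind. The sketch below expresses the same idea (a 1 at distance 1 or 3 on exactly one side of the 0) with one assertion per distance. It assumes "3 characters behind" means a 1 followed by two arbitrary characters, written `1..`; the constant and function names are hypothetical.

```python
import re

# One fixed-width lookbehind per distance, since Python's re rejects
# an alternation of different lengths inside a single lookbehind.
HAS_BEHIND = r"(?:(?<=1)|(?<=1..))"  # a 1 directly before, or 3 back
NO_BEHIND = r"(?<!1)(?<!1..)"        # no 1 at either distance behind
HAS_AHEAD = r"(?=1|..1)"             # a 1 directly after, or 3 ahead
NO_AHEAD = r"(?!1|..1)"              # no 1 at either distance ahead

one_side = re.compile(
    f"{NO_BEHIND}0{HAS_AHEAD}|{HAS_BEHIND}0{NO_AHEAD}"
)


def zero_positions(text):
    """Return the index of every 0 with a qualifying 1 on exactly one side."""
    return [m.start() for m in one_side.finditer(text)]


print(zero_positions("01010"))  # [0, 4]
print(zero_positions("10000"))  # [1, 3]
```

In "10000" the 0 at index 3 matches because the 1 at index 0 is exactly three characters behind it and nothing qualifies ahead. Each additional distance needs one more entry in all four fragments, so the pattern grows quickly as terms are added.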

Upvotes: 4
