Stumbler

Reputation: 2146

regex: consecutive match failure

Using the regular expression

[^a-zA-Z]([A-Z][&+-\/\\][A-Z](([&+-\/\\][A-Z])+[^a-zA-Z\d:]))

to match single letters delimited by symbols, I get successful matches, but a pattern that immediately follows a correct match is not matched properly. Note that the expression is executed case-insensitively.

For instance, in the example

pizza a.b.c C/A/R/L about R/O/F/L s

a.b.c and R/O/F/L are correctly matched, but C/A/R/L is only partially matched (A/R/L). How can this be fixed?

Below is a regex101 mockup, but confusingly it doesn't seem to exhibit the same behaviour I am otherwise seeing.

https://www.regex101.com/r/zV8wI0/1

Upvotes: 1

Views: 71

Answers (1)

Wiktor Stribiżew

Reputation: 626936

You can use

\b[a-zA-Z](?:[./][A-Za-z])*\b

See the regex demo

If you do not need to match whole words, remove \b (word boundary).

Explanation:

  • \b - leading word boundary
  • [a-zA-Z] - 1 letter
  • (?:[./][A-Za-z])* - zero or more sequences (NOTE: if you need at least one . or /, replace * with a +) of:
    • [./] - a dot or a / symbol
    • [A-Za-z] - 1 letter
  • \b - trailing word boundary
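To try the pattern explained above on the sample string from the question, here is a minimal sketch. It assumes Python's `re` module, since the question does not say which engine is in use, and the variable names are illustrative only. The question's original pattern is included for comparison.

```python
import re

text = "pizza a.b.c C/A/R/L about R/O/F/L s"

# The question's original pattern, run case-insensitively as described there.
# It consumes one non-letter before the match and one character after it.
original = re.compile(
    r"[^a-zA-Z]([A-Z][&+-\/\\][A-Z](([&+-\/\\][A-Z])+[^a-zA-Z\d:]))",
    re.IGNORECASE,
)
original_hits = [m.group(1) for m in original.finditer(text)]

# The answer's pattern: word boundaries are zero-width, so nothing around
# the match is consumed and adjacent tokens can both match.
star = re.compile(r"\b[a-zA-Z](?:[./][A-Za-z])*\b")
star_hits = star.findall(text)

# Variant with + instead of *, requiring at least one separator.
plus = re.compile(r"\b[a-zA-Z](?:[./][A-Za-z])+\b")
plus_hits = plus.findall(text)

print(original_hits)  # ['a.b.c ', 'A/R/L ', 'R/O/F/L ']
print(star_hits)      # ['a.b.c', 'C/A/R/L', 'R/O/F/L', 's']
print(plus_hits)      # ['a.b.c', 'C/A/R/L', 'R/O/F/L']
```

In this sketch the original pattern drops the leading `C` because the space after `a.b.c` was already consumed by the previous match, so the next match can only start at the `/`. Note also that the `*` version matches the lone letter `s`, which is why the `+` variant may be preferable.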

If you need to match c.a and r.l in c.a/r.l, you need to use something like

\b[a-zA-Z](?:(?:\.[A-Za-z])+|(?:/[A-Za-z])+)\b

See another regex demo.
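The difference between the two patterns in this answer can be checked with a short sketch, again assuming Python's `re` module and illustrative variable names:

```python
import re

# First pattern: dots and slashes may be mixed freely within one match.
mixed = re.compile(r"\b[a-zA-Z](?:[./][A-Za-z])*\b")

# Second pattern: each match uses only dots or only slashes as separators.
uniform = re.compile(r"\b[a-zA-Z](?:(?:\.[A-Za-z])+|(?:/[A-Za-z])+)\b")

mixed_hits = mixed.findall("c.a/r.l")
uniform_hits = uniform.findall("c.a/r.l")
sample_hits = uniform.findall("pizza a.b.c C/A/R/L about R/O/F/L s")

print(mixed_hits)    # ['c.a/r.l']
print(uniform_hits)  # ['c.a', 'r.l']
print(sample_hits)   # ['a.b.c', 'C/A/R/L', 'R/O/F/L']
```

Because the second pattern requires at least one separator, it does not match the lone `s` at the end of the sample string.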

Upvotes: 1
