Alex

Reputation: 3968

Regex pattern matching and its limitations

I am learning about Regex capabilities, and have a basic understanding of them.

I am now looking into how to use them for pattern matching. One thing I am not sure about is whether they can match patterns in which characters must repeat at specific positions.

I have come across this specific pattern and am wondering whether a Regex would be appropriate for evaluating if a string matches it:

ABBA

CDDC

DUUD

In the above, the first and last characters must match, and so must the middle two. Is this the kind of pattern a Regex can be used to match?

If I were then to add these combinations to the patterns above, could a Regex still match them?

ACACR

DJDJB

Again, the pattern here is about items matching at given indexes, so the value at position 0 also appears at position 2.

Is this an appropriate use for a Regex, or should I use alternative means?

To be clear, my question is about whether a Regex can solve this type of problem, rather than about other ways to solve it.

Upvotes: 0

Views: 322

Answers (3)

AmigoJack

Reputation: 6108

A regular expression cannot detect a palindrome per se, only those with fixed lengths (as in "always 4" or "always 10" characters).
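As a rough illustration of the fixed-length case, the sketch below builds one pattern per length using Python's `re` module. The helper name `fixed_length_palindrome` is made up for this example, and the approach assumes an engine with numbered backreferences; it is not something the answer itself proposes.

```python
import re

def fixed_length_palindrome(n: int) -> re.Pattern:
    """Hypothetical helper: build a regex for palindromes of exactly n characters."""
    half = n // 2
    groups = "(.)" * half              # one capture group per character of the first half
    middle = "." if n % 2 else ""      # odd lengths have one unconstrained middle character
    # Backreferences in reverse order: \half, ..., \2, \1
    mirrors = "".join(f"\\{i}" for i in range(half, 0, -1))
    return re.compile(f"^{groups}{middle}{mirrors}$")

four = fixed_length_palindrome(4)      # same as ^(.)(.)\2\1$
five = fixed_length_palindrome(5)      # same as ^(.)(.).\2\1$

print(bool(four.match("ABBA")))    # True
print(bool(four.match("ABCBA")))   # False: wrong length for this pattern
print(bool(five.match("ABCBA")))   # True
```

Each pattern only handles the one length it was built for, which is the limitation described above.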

Although such a feature looks rather trivial to provide, I have not yet encountered an implementation that offers it. Text editors may have additional extensions such as these:

  • \L all subsequent characters are in lower case until \E
  • \i outputs a sequence number incremented by 1
  • \p outputs the clipboard content

...so something like \R(1) could also provide a "first capture reversed" feature. But as said before: I have not encountered it yet, and am unlikely ever to.

Upvotes: 1

npinti

Reputation: 52185

What you are describing are called backreferences. They let a pattern refer back to text that an earlier group has already matched.

Your first set of examples (ABBA, CDDC, DUUD) can be matched by this regular expression: ^(.)(.)\2\1$. In this expression, we match the first character and capture it in a group (we need this so that we can access it later in the pattern). We do the same with the second character. We then tell the regex engine that the 3rd character must be the same as the value of our second group. This is denoted by \2 inside the pattern (some languages write $2 for the same group in replacement strings). Finally, we say that the 4th character must be the same character we stored in group 1.
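A minimal sketch of that expression in use, assuming Python's `re` module (the thread does not name a language, and the variable name `ABBA` is only illustrative):

```python
import re

# Group 1 = first character, group 2 = second character,
# \2 and \1 require those same characters again in reverse order.
ABBA = re.compile(r"^(.)(.)\2\1$")

for text in ["ABBA", "CDDC", "DUUD", "ABAB", "ABBAX"]:
    print(text, bool(ABBA.match(text)))
# ABBA True
# CDDC True
# DUUD True
# ABAB False
# ABBAX False
```

Note that the groups are not required to differ from each other, so a string such as AAAA matches as well.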

You can extend the above to groups that capture a sequence of several characters; the backreference then matches that whole sequence again.

For your second set of examples (ACACR and DJDJB), you can build on top of the expression above and adapt it to suit your needs.
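One possible way to build on it, sketched in Python under the assumption that the second set means "positions 0 and 2 match, positions 1 and 3 match, then any fifth character". The pattern names here are illustrative, not taken from the answer:

```python
import re

# Second set: char 0 == char 2, char 1 == char 3, then one free character.
ACACX = re.compile(r"^(.)(.)\1\2.$")

# Both shapes in a single expression, using alternation.
# Groups are numbered left to right, so the second branch uses \3 and \4.
EITHER = re.compile(r"^(?:(.)(.)\2\1|(.)(.)\3\4.)$")

for text in ["ACACR", "DJDJB", "ABBA", "ACAR", "ACAC"]:
    print(text, bool(ACACX.match(text)), bool(EITHER.match(text)))
# ACACR True True
# DJDJB True True
# ABBA False True
# ACAR False False
# ACAC False False
```

Whether one combined expression is preferable to two separate ones is a readability trade-off; the combined form gets harder to follow as more shapes are added.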

Upvotes: 0

Timekiller

Reputation: 3136

Yes, you can use backreferences for that. For the first pattern, it would be like '([A-Z])([A-Z])\2\1'.
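Because this version has no anchors and restricts the characters to A-Z, it behaves a little differently depending on how it is applied. A small sketch, assuming Python's `re` module:

```python
import re

pattern = re.compile(r"([A-Z])([A-Z])\2\1")

print(bool(pattern.search("xxABBAyy")))     # True: found somewhere inside the string
print(bool(pattern.fullmatch("xxABBAyy")))  # False: the whole string must fit
print(bool(pattern.fullmatch("ABBA")))      # True
print(bool(pattern.fullmatch("abba")))      # False: lowercase is outside [A-Z]
```

If the whole input has to have the ABBA shape, either add ^ and $ or use a whole-string matching call such as `fullmatch`.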

Upvotes: 0
