Karthik power

Reputation: 31

Match regex for given statement

I want to write a regex that matches the bolded substrings (xyz.90001DUS.annotations and xyz.765896DUS.courses) in the following statement: "The following strings must be matched xyz.90001DUS.annotations and xyz.765896DUS.courses".

I tried to write one, but it does not match the strings above. Can someone please help me?

It should match each bolded string in full; this is the only criterion.

^xyz.([0-9])?DUS.annotations(.*)?\.annotations$

Upvotes: 1

Views: 38

Answers (1)

Wiktor Stribiżew

Reputation: 627410

Your pattern, ^xyz.([0-9])?DUS.annotations(.*)?\.annotations$, cannot match the strings inside a longer string because of the anchors ^ and $. Besides, an unescaped . matches any char other than line break chars, and ([0-9])? matches a single optional digit (while 90001 has five). The (.*)?\.annotations part matches any zero or more chars other than line break chars, as many as possible, up to the last occurrence of .annotations, so the pattern requires a second .annotations that your strings do not contain.
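To see these points in practice, here is a minimal sketch in Python (an assumption, since the question does not name a regex flavor; the variable names are illustrative only). It runs the original pattern against the sentence, against one target string on its own, and against a contrived string shaped the way the pattern actually demands.

```python
import re

text = ("The following strings must be matched "
        "xyz.90001DUS.annotations and xyz.765896DUS.courses")

# The pattern from the question, unchanged.
original = re.compile(r"^xyz.([0-9])?DUS.annotations(.*)?\.annotations$")

# ^ and $ force the match to span the whole input, so searching
# inside the sentence finds nothing.
in_sentence = original.search(text)

# Even on its own, the target fails: ([0-9])? allows at most one digit,
# and a second ".annotations" is required at the end.
standalone = original.search("xyz.90001DUS.annotations")

# A contrived input with one digit and a doubled ".annotations" does match.
contrived = original.search("xyz.9DUS.annotations.annotations")

print(in_sentence, standalone, contrived is not None)
```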

What you can use is

xyz\.\d+DUS\.\w+

Or, with word boundaries:

\bxyz\.\d+DUS\.\w+        <<< In most NFA regex flavors
\yxyz\.\d+DUS\.\w+        <<< In PostgreSQL, Tcl
\mxyz\.\d+DUS\.\w+        <<< R (TRE), Tcl
\<xyz\.\d+DUS\.\w+        <<< GNU word boundary
[[:<:]]xyz\.\d+DUS\.\w+   <<< POSIX word boundary

See the regex demo. You do not need a word boundary after \w+: because \w+ is greedy, it consumes every consecutive word char, so the match always ends at a word boundary.

Details:

  • xyz\. - xyz.
  • \d+ - one or more digits
  • DUS\. - DUS.
  • \w+ - one or more word chars.
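The pattern broken down above can be tried out with a short sketch. Python's re module is assumed here (the question does not say which flavor is in use), so the \b form of the word boundary applies; the variable names are illustrative only.

```python
import re

text = ("The following strings must be matched "
        "xyz.90001DUS.annotations and xyz.765896DUS.courses")

# Leading word boundary, escaped literal dots, one or more digits,
# then one or more word chars after "DUS.".
pattern = re.compile(r"\bxyz\.\d+DUS\.\w+")
matches = pattern.findall(text)

# Adding a trailing \b changes nothing, since the greedy \w+ already
# stops at a word boundary.
with_trailing_boundary = re.findall(r"\bxyz\.\d+DUS\.\w+\b", text)

print(matches)
# ['xyz.90001DUS.annotations', 'xyz.765896DUS.courses']
```

If the engine in use is not a \b-style flavor, swap in the matching boundary token from the list of alternatives in the answer.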

Upvotes: 1
