Reputation: 11
The input strings consist of the letters I, N, P, U, Y and X.
- I have to verify, with a Perl regexp, that the input contains only these letters and nothing else.
- I also have to verify that the input contains at least 2 occurrences of "NP" (without quotes).
Example string:
INPYUXNPININNPXX
The strings are all in uppercase.
Upvotes: 0
Views: 81
Reputation: 386541
The cleanest solution is:
/^[INPUXY]*\z/ && /NP.*NP/s
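As a quick check of the two-test version, here is a minimal sketch that wraps it in a sub and runs it on a few inputs. The sub name is_valid and the sample strings (other than the one from the question) are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $s uses only the letters I N P U X Y
# and contains "NP" at least twice.
sub is_valid {
    my ($s) = @_;
    return ( $s =~ /^[INPUXY]*\z/ && $s =~ /NP.*NP/s ) ? 1 : 0;
}

# The first string is the example from the question; the others are
# invented: one NP only, the shortest valid string, a disallowed letter.
for my $s (qw( INPYUXNPININNPXX INPXX NPNP NPANP )) {
    printf "%-18s %s\n", $s, is_valid($s) ? "ok" : "rejected";
}
```

Only the first and third strings should be reported as ok: INPXX has a single NP, and NPANP contains a letter outside the allowed set.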
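The following is more efficient, as it avoids matching the string twice and prevents backtracking on failure:
/
^
(?: (?:[IPUXY]|N(?!P))* NP ){2}
[INPUXY]*
\z
/x
Here N(?!P) matches an N that is not followed by P, so runs such as NNP are still handled and each repetition stops at the next NP.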
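To capture what's between the first two occurrences of NP, you can use:
/
^
(?:[IPUXY]|N(?!P))* NP
( (?:[IPUXY]|N(?!P))* ) NP
[INPUXY]*
\z
/x
In this pattern N(?!P) matches an N that is not followed by P, so an N directly before NP (as in NNP) ends up in the capture instead of making the match fail.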
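A minimal sketch of using the capturing pattern from Perl code. The sub name between_nps is hypothetical, and this sketch writes the "N not followed by P" case as the negative lookahead N(?!P), so that an N directly before NP does not block the match.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text between the first two "NP"
# occurrences, or undef if the string fails validation.
sub between_nps {
    my ($s) = @_;
    return $s =~ /
        ^
        (?: [IPUXY] | N(?!P) )* NP        # everything up to the first NP
        ( (?: [IPUXY] | N(?!P) )* ) NP    # capture up to the second NP
        [INPUXY]*                         # the rest may contain more NPs
        \z
    /x ? $1 : undef;
}

my $middle = between_nps("INPYUXNPININNPXX");
print defined $middle ? "middle: $middle\n" : "invalid\n";
```

For the example string from the question, the captured text is YUX.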
Upvotes: 1
Reputation: 73014
Use this:
^[INPUYX]*NP[INPUYX]*?NP[INPUYX]*$
See it in action: http://regex101.com/r/vI2xQ6
Effectively, what we're doing here is allowing 0 or more characters from your character class, matching the first (required) occurrence of NP, then ensuring that NP occurs at least once more before the end of the string.
If you also wanted to capture the text between the two NPs, you could use:
^(?=(?:(.*?)NP){2})[INPUYX]+$
Or, as @ikegami points out, to match ONLY the single line (no trailing newline allowed):
\A(?=(?:(.*?)NP){2})[INPUYX]+\z
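To see the difference between the two anchoring styles, here is a small sketch that runs both patterns against the same string with and without a trailing newline. The variable names and the test string are made up for illustration.

```perl
use strict;
use warnings;

my $with_dollar = qr/^(?=(?:(.*?)NP){2})[INPUYX]+$/;
my $with_z      = qr/\A(?=(?:(.*?)NP){2})[INPUYX]+\z/;

for my $s ("INPYUXNPXX", "INPYUXNPXX\n") {
    (my $shown = $s) =~ s/\n/\\n/;    # make a trailing newline visible
    for my $pair ([ '$', $with_dollar ], [ '\z', $with_z ]) {
        my ($label, $re) = @$pair;
        if ($s =~ $re) {
            # $1 holds what the capture group matched on its last repetition
            printf "%s with %s: match, middle = [%s]\n", $shown, $label, $1;
        } else {
            printf "%s with %s: no match\n", $shown, $label;
        }
    }
}
```

The $ version accepts the string even when it ends in a newline, while the \z version only accepts it when nothing follows the last letter.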
Upvotes: 1
Reputation: 785866
You can use this lookahead-based regex in PCRE:
^(?=(?:.*?NP){2})[INPUYX]+$
Explanation:
^ assert position at start of a line
(?=(?:.*?NP){2}) Positive Lookahead - Assert that the regex below can be matched
    (?:.*?NP){2} Non-capturing group
        Quantifier: Exactly 2 times
        .*? matches any character (except newline)
            Quantifier: Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
        NP matches the characters NP literally (case sensitive)
[INPUYX]+ match a single character present in the list below
    Quantifier: Between one and unlimited times, as many times as possible, giving back as needed [greedy]
    INPUYX a single character in the list INPUYX literally (case sensitive)
$ assert position at end of a line
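The same pattern can be written with the /x flag so that the explanation sits next to each piece. This is only a sketch: the variable name and the sample strings (other than the one from the question) are made up.

```perl
use strict;
use warnings;

my $re = qr{
    ^                       # start of the string
    (?= (?: .*? NP ){2} )   # lookahead: NP can be found twice from here
    [INPUYX]+               # one or more of the allowed letters
    $                       # end of the string, or just before a final newline
}x;

# The lookahead only counts NPs; the character class does the validation,
# so a string such as NPaNP passes the lookahead but fails overall.
for my $s (qw( INPYUXNPININNPXX INPXX NPNP NPaNP )) {
    printf "%-18s %s\n", $s, $s =~ $re ? "match" : "no match";
}
```

Only the first and third strings should match.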
Upvotes: 4