linkyndy

Reputation: 17900

Regex/PHP check if group of characters appears only once

I am trying to validate an input in PHP with REGEX. I want to check whether the input has the %s character group inside it and that it appears only once. Otherwise, the rule should fail.

Here's what I've tried:

preg_match('|^[0-9a-zA-Z_-\s:;,\.\?!\(\)\p{L}(%s){1}]*$|u', $value); (there are also some other rules besides this; I've tried the (%s){1} part and it doesn't work).

I believe there is a very easy solution to this, but I'm not really into regexes... Thank you for your help!

Upvotes: 1

Views: 1391

Answers (5)

Alan Moore

Reputation: 75222

Try this:

'|^(?=(?:(?!%s).)*%s(?:(?!%s).)*$)[0-9_\s:;,.?!()%\p{L}-]+$|u'

The (%s){1} sequence inside the square brackets probably doesn't do what you think it does, but never mind, the solution is more complex. In fact, {1} should never appear anywhere in a regex. It doesn't ensure that there's only one of something, as many people assume. As a matter of fact, it doesn't do anything; it's pure clutter.
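As a quick illustration of the point about {1}, here is a minimal sketch (the sample string 'aaa' is made up for the example). It shows that a{1} finds the same match as a bare a and does nothing to limit how often a may occur in the string:

```php
<?php
// Hypothetical sample input, chosen only for illustration.
$input = 'aaa';

// {1} means "repeat the preceding item exactly once", which is what
// the item already does without any quantifier.
$withQuantifier    = preg_match('/a{1}/', $input, $m1);
$withoutQuantifier = preg_match('/a/', $input, $m2);

// Both calls succeed on a string containing three a's, and both
// capture the same first match.
$sameResult = ($withQuantifier === $withoutQuantifier) && ($m1 === $m2);
```

Here $sameResult ends up true: the quantifier only describes the local repetition, not the number of occurrences in the whole input.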


EDIT (in answer to the comment): To ensure that only one of a particular sequence is present in a string, you have to actively examine every single character, classifying it as either part-of-%s or not part-of-%s. To that end, (?:(?!%s).)* consumes one character at a time, after the negative lookahead has confirmed that the character is not the start of %s.

When that part of the lookahead expression quits matching, the next thing in the string has to be %s. Then the second (?:(?!%s).)*$ kicks in to confirm that there are no more %s sequences until the end of the string.

And don't forget that the lookahead expression must be anchored at both ends. Because the lookahead is the first thing after the main regex's start anchor you don't need to add another ^. But the lookahead must end with its own $ anchor.
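To see the lookahead in action, here is a minimal sketch. The function name hasSinglePlaceholder() and the sample strings are hypothetical, and the character class is assumed to include a literal % so that the placeholder itself passes the character check:

```php
<?php
// Hypothetical helper: true when $value contains %s exactly once and
// otherwise consists only of the allowed characters.
function hasSinglePlaceholder(string $value): bool
{
    // Lookahead: non-%s characters, then %s, then non-%s characters to the end.
    // Main part: every character must belong to the class (assumed to include %).
    $pattern = '|^(?=(?:(?!%s).)*%s(?:(?!%s).)*$)[0-9_\s:;,.?!()%\p{L}-]+$|u';

    return preg_match($pattern, $value) === 1;
}

$once  = hasSinglePlaceholder('Hello %s, welcome!'); // one placeholder
$twice = hasSinglePlaceholder('Hello %s and %s');    // two placeholders
$none  = hasSinglePlaceholder('Hello there');        // no placeholder
```

Only the first call should succeed; the second fails in the lookahead's trailing part, and the third fails because the lookahead never finds %s.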

Upvotes: 1

agent-j

Reputation: 27913

If I understand your question, you need a positive lookahead. The lookahead causes the expression to only match if it finds a single %s.

preg_match('|^(?=(?:(?!%s).)*%s(?:(?!%s).)*$)[0-9a-zA-Z_\s:;,.?!()%\p{L}-]*$|u', $value);

I'll explain how each part works:

^(?=(?:(?!%s).)*%s(?:(?!%s).)*$) is a zero-width assertion: (?=regex) is a positive lookahead, meaning it must match but does not "eat" any characters. It requires that the whole line contain exactly one %s. Note that %s has to be matched as a sequence here; inside square brackets, [^%s] would mean "any single character other than % or s", which is not the same thing.

[0-9a-zA-Z_\s:;,.?!()%\p{L}-]*$ The remaining part of the regex also looks at the entire string and ensures that it is composed only of the characters in the character class (like your original regex, but with % listed explicitly and the hyphen moved to the end so that it is taken literally).

Upvotes: 4

linkyndy

Reputation: 17900

I managed to do this with PHP's substr_count() function, following Johnsyweb's suggestion to use an alternate way to perform the validation, and because the suggested regexes seem pretty complicated.
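A minimal sketch of what that check might look like. The function name containsExactlyOnce() and the sample strings are hypothetical; the post does not show the actual code:

```php
<?php
// Hypothetical helper: true when $needle occurs exactly once in $value.
function containsExactlyOnce(string $value, string $needle = '%s'): bool
{
    // substr_count() returns the number of non-overlapping occurrences.
    return substr_count($value, $needle) === 1;
}

$valid   = containsExactlyOnce('You have %s new messages');
$invalid = containsExactlyOnce('%s sent you %s');
```

This only counts the placeholder; any restriction on the other characters would still need its own check.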

Thank you again!

Upvotes: 2

johnsyweb

Reputation: 141780

If you're not "into" regular expressions, why not solve this with PHP?

One call to the builtin strpos() will tell you if the string has a match. A second call will tell you if it appears more than once.

This will be easier for you to read and for others to maintain.
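The two strpos() calls described above might look like this. It is only a sketch, and the function name appearsOnce() is hypothetical:

```php
<?php
// Hypothetical helper built on two strpos() calls.
function appearsOnce(string $value, string $needle = '%s'): bool
{
    // First call: is the needle there at all?
    $first = strpos($value, $needle);
    if ($first === false) {
        return false;
    }

    // Second call: search again, starting just past the first match.
    // Finding nothing means the needle occurs exactly once.
    return strpos($value, $needle, $first + strlen($needle)) === false;
}

$ok        = appearsOnce('Dear %s,');
$tooMany   = appearsOnce('%s and %s');
$notAtAll  = appearsOnce('Dear customer,');
```

The strict === false comparisons matter, because strpos() returns 0 (not false) when the match is at the very start of the string.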

Upvotes: 0

Aleks G

Reputation: 57316

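Alternatively, you can use preg_match_all to count the occurrences of %s and check the number of matches. If it's 1, then you're OK. This has to be a separate check from your character validation: your anchored pattern matches the whole string at most once, so its match count says nothing about how often %s appears. Something like this:

$result = (preg_match_all('|%s|', $value) == 1);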
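A minimal sketch of counting the placeholder with preg_match_all, assuming a separate, unanchored pattern that matches only the literal %s. The helper name countPlaceholders() and the sample strings are hypothetical:

```php
<?php
// Hypothetical helper: number of times $needle occurs in $value.
function countPlaceholders(string $value, string $needle = '%s'): int
{
    // preg_quote() escapes any regex metacharacters in the needle.
    $pattern = '/' . preg_quote($needle, '/') . '/u';

    // preg_match_all() returns the number of full matches, or false on error
    // (for example when $value is not valid UTF-8 under the u modifier).
    $count = preg_match_all($pattern, $value);

    return $count === false ? 0 : $count;
}

$result    = (countPlaceholders('Total: %s items') === 1);
$duplicate = (countPlaceholders('%s of %s') === 1);
```

The check against the allowed character class would still be its own preg_match() call alongside this count.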

Upvotes: 1
