gremo

Reputation: 48899

How can I "merge" these two regular expressions in PHP?

I'm learning regular expressions, so please go easy on me!

A username is considered valid when it does not start with _ (underscore) and contains only word characters (letters, digits and the underscore itself):

namespace Gremo\ExtraValidationBundle\Validator\Constraints;

use Symfony\Component\Validator\Constraint;
use Symfony\Component\Validator\ConstraintValidator;

class UsernameValidator extends ConstraintValidator
{
    public function validate($value, Constraint $constraint)
    {
        // Violation if username starts with underscore
        if (preg_match('/^_/', $value, $matches)) {
            $this->context->addViolation($constraint->message);
            return;
        }

        // Violation if username does not contain all word characters
        if (!preg_match('/^\w+$/', $value, $matches)) {
            $this->context->addViolation($constraint->message);
        }
    }
}

In order to merge them into one regular expression, I've tried the following:

^_+[^\w]+$

To be read as: add a violation if it starts with an underscore (possibly more than one) and if at least one following character is not allowed (not a letter, digit or underscore). It does not work with "_test", for example.

Can you help me understand where I'm wrong?

Upvotes: 4

Views: 295

Answers (4)

codaddict

Reputation: 454970

You can add a negative lookahead assertion to your 2nd regex:

^(?!_)\w+$

Because of the anchors, this has to match the entire string and not just a part of it: the string must not begin with an underscore and must consist of one or more word characters.
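A minimal sketch of the lookahead pattern in use. The function name isValidUsername and the sample inputs are hypothetical, chosen only for illustration:

```php
<?php
// Hypothetical helper, not part of the original validator class.
function isValidUsername(string $value): bool
{
    // (?!_) rejects a leading underscore without consuming any input;
    // \w+ then requires word characters all the way to the end anchor.
    return preg_match('/^(?!_)\w+$/', $value) === 1;
}

foreach (['test', 'te_st', '_test', 'te st'] as $name) {
    printf("%-6s => %s\n", $name, var_export(isValidUsername($name), true));
}
```

With these inputs, "test" and "te_st" are accepted, while "_test" and "te st" are rejected.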


Upvotes: 5

Peter O'Callaghan

Reputation: 6186

There are, of course, many different ways of doing this. I'd probably go with something along the lines of /^(?!_)[\w\d_]+$/.

The [\w\d_]+ part, combined with the anchors (^ and $), essentially asserts that the entire string consists only of those characters (\w already covers digits and the underscore, so the class could be written as plain \w). The (?!_) part is a negative lookahead assertion. It checks that the next character isn't an underscore. Since it's right next to the ^ anchor, this ensures the first character isn't an underscore.

Upvotes: 0

The simple solution is this:

if (!preg_match('/^[a-zA-Z0-9]+$/', $value, $matches)) {

If you want the \w group (which includes the underscore) but without the underscore anywhere in the username, [a-zA-Z0-9] is equivalent to \w minus the underscore.
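A small sketch of how this pattern treats underscores. The helper name isAlphanumeric and the sample names are made up for illustration. Note that it rejects an underscore anywhere in the username, not only at the start:

```php
<?php
// Hypothetical helper wrapping the pattern from this answer.
function isAlphanumeric(string $value): bool
{
    // Only ASCII letters and digits are allowed, in every position.
    return preg_match('/^[a-zA-Z0-9]+$/', $value) === 1;
}

foreach (['test', 'test42', '_test', 'te_st'] as $name) {
    printf("%-6s => %s\n", $name, var_export(isAlphanumeric($name), true));
}
```

Here "test" and "test42" pass, while both "_test" and "te_st" fail, so this is stricter than the rule stated in the question.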

Upvotes: 0

ruakh

Reputation: 183270

The problem is De Morgan's Law. ^_+[^\w]+$ will only match if it starts with one or more underscores and all subsequent characters are non-word characters. You need to match if it starts with an underscore or any character is a non-word character.
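The difference between "and" and "or" can be sketched like this. The variable names and the '/^_|\W/' pattern are my own illustration of the "or" condition described above, not something from the question; note that this "or" pattern does not flag the empty string, which the original two-step check does reject:

```php
<?php
// The attempted pattern: leading underscores AND THEN only non-word characters.
$attempt = '/^_+[^\w]+$/';
// Illustrative "invalid" pattern: leading underscore OR any non-word character.
$either = '/^_|\W/';

foreach (['_test', '_!!', 'te st', 'test'] as $name) {
    printf(
        "%-6s attempt=%d either=%d\n",
        $name,
        preg_match($attempt, $name),
        preg_match($either, $name)
    );
}
```

The attempted pattern only fires for "_!!", whereas the "or" pattern flags "_test", "_!!" and "te st" and leaves "test" alone.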

I think it's simpler, in this case, to focus on the valid usernames: they start with a word character other than an underscore, and all remaining characters are word characters. In other words, valid usernames are described by the pattern ^[^\W_]\w*$. So, you can write:

if (! preg_match('/^[^\W_]\w*$/', $value, $matches)) {
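As a rough check that the single pattern behaves like the original two-step validation, the sketch below compares both on a few sample inputs. The function names twoStep and singlePattern and the sample list are hypothetical, and the Symfony classes are left out to keep it self-contained:

```php
<?php
// The two checks from the question, returning true when the username is valid.
function twoStep(string $value): bool
{
    if (preg_match('/^_/', $value)) {
        return false; // starts with an underscore
    }

    return preg_match('/^\w+$/', $value) === 1; // only word characters
}

// The merged pattern: one non-underscore word character, then any word characters.
function singlePattern(string $value): bool
{
    return preg_match('/^[^\W_]\w*$/', $value) === 1;
}

foreach (['test', 'te_st', '9lives', '_test', 'te st', ''] as $name) {
    printf(
        "%-8s twoStep=%s singlePattern=%s\n",
        var_export($name, true),
        var_export(twoStep($name), true),
        var_export(singlePattern($name), true)
    );
}
```

Both functions agree on every sample: the first three names are valid, the last three are not.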

Upvotes: 1
