Bhavya Gupta

Reputation: 61

How do I get a regex expression to contain only uppercase letters or numbers?

Regex expression: [A-Z]([^0-9]|[^A-Z])+[A-Z]

The requirements are that the string should start and end with a capital letter A-Z and contain at least one digit in between. The inside should contain nothing other than capital letters and digits. However, my expression is accepting spaces and punctuation too.

My expression fails the test case A65AJ3L 3F,D: it accepts the string, even though the comma and the whitespace should make it invalid.

Why does this happen when I explicitly said only numbers and uppercase letters can be in the string?

Upvotes: 0

Views: 9416

Answers (2)

Ryszard Czech

Reputation: 18621

Use

^(?=\D*\d\D*$)[A-Z][A-Z\d]*[A-Z]$

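(?=\D*\d\D*$) requires exactly one digit in the string, no more, no less. If one or more digits should be accepted, as the question asks, the lookahead can be relaxed to (?=\D*\d).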

EXPLANATION

--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    \D*                      non-digits (all but 0-9) (0 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    \d                       digits (0-9)
--------------------------------------------------------------------------------
    \D*                      non-digits (all but 0-9) (0 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    $                        before an optional \n, and the end of
                             the string
--------------------------------------------------------------------------------
  )                        end of look-ahead
--------------------------------------------------------------------------------
  [A-Z]                    any character of: 'A' to 'Z'
--------------------------------------------------------------------------------
  [A-Z\d]*                 any character of: 'A' to 'Z', digits (0-9)
                           (0 or more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  [A-Z]                    any character of: 'A' to 'Z'
--------------------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
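
To see what the lookahead does in practice, here is a minimal sketch in Python (my choice of language, the question does not name one). The second pattern is a hypothetical variant that is not part of the answer above: it only checks that some digit exists, so it accepts one or more digits.

```python
import re

# Pattern from the answer above: the lookahead allows exactly one digit.
exactly_one_digit = re.compile(r"^(?=\D*\d\D*$)[A-Z][A-Z\d]*[A-Z]$")

# Hypothetical variant (not from the answer): the lookahead only requires
# that some digit exists, so one or more digits are accepted.
at_least_one_digit = re.compile(r"^(?=\D*\d)[A-Z][A-Z\d]*[A-Z]$")

for text in ["A1B", "A65AJ3L", "AAA", "A65AJ3L 3F,D"]:
    strict = exactly_one_digit.fullmatch(text) is not None
    relaxed = at_least_one_digit.fullmatch(text) is not None
    print(f"{text!r}: exactly one = {strict}, at least one = {relaxed}")
```

With these inputs, only A1B passes the strict pattern, while A1B and A65AJ3L pass the relaxed one. Both reject AAA (no digit) and the string containing a space and a comma.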

Upvotes: 0

The fourth bird

Reputation: 163457

Starting the character class with [^ makes it a negated character class.

Using ([^0-9]|[^A-Z])+ matches any char except a digit (but does match A-Z), or any char except A-Z (but does match a digit).

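Because no character is both a digit and an uppercase letter, every character satisfies at least one of the two alternatives, so the group can match any character.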
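
A quick way to check this is a small Python sketch (Python is my assumption here, the question does not say which regex flavor is used). It tests the inner alternation against a few sample characters, then the full pattern from the question against the failing test case.

```python
import re

# The inner alternation from the question, tested one character at a time.
inner = re.compile(r"[^0-9]|[^A-Z]")

# A letter, a digit, a space, a comma and a lowercase letter:
# each one is either "not a digit" or "not an uppercase letter".
samples = ["A", "7", " ", ",", "z"]
all_match = all(inner.fullmatch(ch) is not None for ch in samples)
print(all_match)  # True

# The full pattern from the question therefore accepts the test case.
original = re.compile(r"[A-Z]([^0-9]|[^A-Z])+[A-Z]")
accepted = original.fullmatch("A65AJ3L 3F,D") is not None
print(accepted)  # True
```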

If you turn it into [A-Z]([0-9]|[A-Z])+[A-Z], it still does not require a digit on the inside: the alternation | lets every inner character be a letter, so it can still match AAA, for example.
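
As a small illustration in Python (an assumed language, not one named in the question), the non-negated version does reject spaces and punctuation, but it still lets a digit-free string through:

```python
import re

# Non-negated version of the question's pattern.
letters_or_digits = re.compile(r"[A-Z]([0-9]|[A-Z])+[A-Z]")

# Punctuation and whitespace are now rejected...
rejects_punctuation = letters_or_digits.fullmatch("A65AJ3L 3F,D") is None
print(rejects_punctuation)  # True

# ...but a string without any digit is still accepted.
matches_aaa = letters_or_digits.fullmatch("AAA") is not None
print(matches_aaa)  # True
```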

You might use:

^[A-Z]+[0-9][A-Z0-9]*[A-Z]$

The pattern matches:

  • ^ Start of string
  • [A-Z]+ Match 1+ times A-Z
  • [0-9] Match a single digit
  • [A-Z0-9]* Match 0+ times either A-Z or 0-9
  • [A-Z] Match a single char A-Z
  • $ End of string
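
The pattern can be tried out with a minimal Python sketch like the one below. The helper name is_valid and the sample strings are only illustrative. Note that in Python, $ also matches just before a trailing newline, so the sketch uses fullmatch to require that the whole string matches.

```python
import re

# Pattern from the answer above, unchanged.
pattern = re.compile(r"^[A-Z]+[0-9][A-Z0-9]*[A-Z]$")


def is_valid(text: str) -> bool:
    """Return True if text starts and ends with A-Z, contains at least one
    digit in between, and holds nothing but uppercase letters and digits."""
    # fullmatch rejects a trailing newline that a bare $ would tolerate.
    return pattern.fullmatch(text) is not None


print(is_valid("A65AJ3L"))       # True
print(is_valid("A1B"))           # True
print(is_valid("AAA"))           # False: no digit on the inside
print(is_valid("AB1"))           # False: ends with a digit
print(is_valid("A65AJ3L 3F,D"))  # False: space and comma
```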

Upvotes: 4
