user2725402

Reputation: 4629

Python Regex validation

I am brand new to Python.

I'm trying to ensure a username contains ONLY alphabetic characters (a-z only). I have the code below. If I type only digits (e.g. 7777), it correctly reports an error. If I type a mix of digits and letters that STARTS with a digit, it is also rejected. But if I start with a letter (a-z) and then include digits later in the string, it is accepted as valid. Why?

def register():
    uf = open("user.txt","r")
    un = re.compile(r'[a-z]')
    up = re.compile(r'[a-zA-Z0-9()$%_/.]*$')
    print("Register new user:\n")
    new_user = input("Please enter a username:\n-->")
    if len(new_user) > 10:
        print("That username is too long. Max 10 characters please.\n")
        register()
    #elif not un.match(new_user):
    elif not re.match('[a-z]',new_user):
        print("That username is invalid. Only letters allowed, no numbers or special characters.\n")
        register()
    else:
        print(f"Thanks {new_user}")


Upvotes: 0

Views: 238

Answers (3)

Gsk

Reputation: 2945

In your code, uf, un and up are unused variables.

The only place where you validate anything is the line elif not re.match('[a-z]',new_user):, and because re.match only anchors at the start of the string, that line just checks that the first character is a lowercase letter.

To ensure that a variable contains only letters, use: elif not re.match('^[a-zA-Z]{1,10}$',new_user):

The regex ^[a-zA-Z]{1,10}$ consists of:

  • ^ : anchors the match at the start of the string
  • [a-zA-Z] : matches one character between a and z or between A and Z
  • {1,10} : requires the preceding item (a letter) to be repeated between 1 and 10 times. As LhasaDad suggests in the comments, you may want to raise the minimum number of characters, e.g. to 4: {4,10}. We don't know what this username is for, but 1 character seems too low in any case.
  • $ : anchors the match at the end of the string
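As a quick check of the pattern described above, here is a minimal sketch that runs it against a few inputs. The helper name is_valid_username and the sample strings are only illustrative; they are not part of the original code.

```python
import re

# Same pattern as above: 1 to 10 ASCII letters, anchored at both ends.
USERNAME_RE = re.compile(r'^[a-zA-Z]{1,10}$')

def is_valid_username(name):
    """Hypothetical helper: True if name is 1-10 letters and nothing else."""
    return USERNAME_RE.match(name) is not None

# Letters only passes; digits, the empty string and 11 letters are rejected.
for candidate in ["abc", "a33", "7777", "", "abcdefghijk"]:
    print(repr(candidate), is_valid_username(candidate))
```

Because {1,10} already caps the length, the separate len(new_user) > 10 check is only needed if you want a distinct "too long" message.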

Since you were looking for a RegEx, I've produced and explained one, but Guy's answer is more pythonic.

IMPORTANT:
You didn't ask about this, but you may run into an error you're not expecting: since register() calls itself, it is a recursive function. If the user enters an invalid username too many times in a row (about 1000 with Python's default recursion limit), you'll get a RecursionError.
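One way to avoid the recursion is to retry in a loop. This is only a sketch of the idea, keeping the question's messages; the read parameter is a hypothetical addition (it defaults to input) so the function can be exercised without typing.

```python
import re

def register(read=input):
    """Keep asking until a valid username is entered, then return it.

    `read` is a hypothetical parameter, not in the original code; it
    defaults to the built-in input().
    """
    print("Register new user:\n")
    while True:
        new_user = read("Please enter a username:\n-->")
        if len(new_user) > 10:
            print("That username is too long. Max 10 characters please.\n")
        elif not re.match(r'^[a-zA-Z]{1,10}$', new_user):
            print("That username is invalid. Only letters allowed, no numbers or special characters.\n")
        else:
            print(f"Thanks {new_user}")
            return new_user
```

Each failed attempt just goes around the loop again, so the call stack never grows no matter how many times the user gets it wrong.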

Upvotes: 2

Guy

Reputation: 50819

Why don't you use isalpha()?

string = '333'
print(string.isalpha()) # False

string = 'a33'
print(string.isalpha()) # False

string = 'aWWff'
print(string.isalpha()) # True
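One caveat worth knowing: isalpha() also returns True for non-ASCII letters such as 'é'. If only a-z and A-Z should be accepted, a possible sketch is to combine it with isascii() (available from Python 3.7) and a length check. The function name and the max_len parameter are assumptions made for this example.

```python
def is_valid_username(name, max_len=10):
    """Hypothetical helper: ASCII letters only, 1 to max_len characters."""
    # isalpha() is False for the empty string, so no separate minimum check is needed.
    # isascii() rejects accented and other non-ASCII letters that isalpha() accepts.
    return name.isalpha() and name.isascii() and len(name) <= max_len

print(is_valid_username('aWWff'))  # True
print(is_valid_username('a33'))    # False
print(is_valid_username('é'))      # False
```

If only lowercase letters are wanted, as in the question, adding name.islower() to the condition would cover that.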

Upvotes: 5

bereal

Reputation: 34281

As the re.match docs say:

If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding match object.

That's exactly what's happening in your case: a letter at the beginning of the string is enough to satisfy the match. Try the expression [a-z]+$ instead, which makes sure that the match extends to the end of the string.

You can check the length in the same pass: [a-z]{1,10}$.
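The difference is easy to see side by side. The sketch below also shows re.fullmatch (Python 3.4+), which is not mentioned above but is an alternative that requires the whole string to match without needing an explicit $.

```python
import re

# Unanchored: only the first character has to be a lowercase letter.
print(bool(re.match(r'[a-z]', 'a33')))             # True

# Anchored at the end: the whole string must be lowercase letters.
print(bool(re.match(r'[a-z]+$', 'a33')))           # False
print(bool(re.match(r'[a-z]{1,10}$', 'abc')))      # True

# fullmatch requires the entire string to match the pattern.
print(bool(re.fullmatch(r'[a-z]{1,10}', 'a33')))   # False
print(bool(re.fullmatch(r'[a-z]{1,10}', 'abc')))   # True

# One subtle difference: $ also matches just before a trailing newline.
print(bool(re.match(r'[a-z]+$', 'abc\n')))         # True
print(bool(re.fullmatch(r'[a-z]+', 'abc\n')))      # False
```

Strings returned by input() never end in a newline, so for this use case both approaches behave the same.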

Upvotes: 0
