theateist

Reputation: 14411

How to write a regex that checks if a word starts with a letter and contains {3,16} numbers and letters?

I need to write a regex that checks words against the following rule: a word must start with a letter and may contain {3,16} numbers and/or letters.

I tried the regex \b[A-Za-z]+[A-Za-z0-9]*{3,16}\b, but I get an error. What is wrong?

Upvotes: 3

Views: 11505

Answers (3)

John Keyes

Reputation: 5604

Some sample Python code:

subject = """
This is som3 s@mpl3 text.

One possible sixteen letter word is abstractednesses. 2012 is not
a word as it does not contain any alphabetic charat3rs.

Unfortunately conventionalizations contains 20 characters.
"""

import re
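# Raw string so that \s, \w and \W reach the regex engine untouched.
# Note that \w also accepts "_", which [A-Za-z0-9] would not.
words = re.compile(r'((?<=\s)[A-Za-z]\w{2,15})\W', re.M)
res = words.findall(subject)

# res is:
# ['This', 'som3', 'text', 'One', 'possible', 'sixteen', 'letter', 'word',
#  'abstractednesses', 'not', 'word', 'does', 'not', 'contain', 'any',
#  'alphabetic', 'charat3rs', 'Unfortunately', 'contains', 'characters']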

Upvotes: 0

Benj

Reputation: 32418

Your problem is that your second character class has both * and {3,16}, which means the {3,16} has nothing to quantify. Additionally, you state that the input string must start with just one letter, but + means one or more. I imagine you want:

\b                  // boundary
[A-Za-z]            // single letter
[A-Za-z0-9]{2,15}   // a further 2-15 alphanumerics
\b                  // boundary
Upvotes: 3

Bohemian

Reputation: 425258

You are getting the error because of the *. Remove it to get a valid regex:

\b[A-Za-z]+[A-Za-z0-9]{3,16}\b

However, this regex isn't quite what you want, which is:

\b[A-Za-z][A-Za-z0-9]{2,15}\b

You need {2,15} (and not {3,16}) because the first character counts as one of the 3 to 16 characters.
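
The length arithmetic is easy to check by running both patterns above against a few words. This is only a sketch in Python (an assumption, since the question does not name a language), and the test words are hypothetical ones I picked for their lengths.

```python
import re

# Pattern with only the '*' removed: one or more letters, then 3-16 alphanumerics.
loose = re.compile(r"\b[A-Za-z]+[A-Za-z0-9]{3,16}\b")
# Pattern with the length fixed: exactly one letter, then 2-15 alphanumerics.
strict = re.compile(r"\b[A-Za-z][A-Za-z0-9]{2,15}\b")

# Hypothetical test words with lengths 3, 16, 20 and 4.
candidates = ["ab1", "abstractednesses", "conventionalizations", "1abc"]

# Map each word to (matched by loose, matched by strict).
results = {
    w: (bool(loose.fullmatch(w)), bool(strict.fullmatch(w)))
    for w in candidates
}

for w, (is_loose, is_strict) in results.items():
    print(f"{w!r:24} len={len(w):2} loose={is_loose} strict={is_strict}")
```

The loose pattern needs at least four characters, so it rejects the three-character "ab1", and because + is unbounded it accepts the 20-character word. The strict pattern does the opposite in both cases.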

Upvotes: 3
