Reputation: 45
I've written a regex, to the best of my ability, that allows only the Latin character set, with an optional '-' that, if included, MUST be followed by at least one other Latin character.
My RegEx:
[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+(?:[-]?[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+)
I came to this after reading a few posts and rereading the manual to figure out the best way to approach this. This check is attached to a text field where a user types only their first name and then submits.
It works okay but there is certainly room for improvement.
Examples:
Tom // passes
Éve // passes
John-Paul // passes
2pac // passes and removes numbers (not really what I want)
John316 // passes and removes numbers (not really what I want)
What I REALLY want is for those last two checks to fail.
How would I revise it to get the outcome I'd like?
Upvotes: 1
Views: 78
Reputation: 626699
You need to anchor the regex by adding ^ at the start and $ at the end. That way the whole input string must match the pattern, so no other symbols are allowed anywhere in it.
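To illustrate why the anchors matter, here is a minimal sketch that runs the question's original pattern with and without ^ and $. The names LETTERS, unanchored and anchored are only for illustration and do not come from the question.

```javascript
// Same character class as in the question, kept in one place for reuse.
const LETTERS = '[\\u00BF-\\u1FFF\\u2C00-\\uD7FFA-Za-z]';

// Unanchored: test() succeeds if the pattern matches ANYWHERE in the input.
const unanchored = new RegExp(`${LETTERS}+(?:[-]?${LETTERS}+)`);

// Anchored: the pattern has to cover the input from start to end.
const anchored = new RegExp(`^${LETTERS}+(?:[-]?${LETTERS}+)$`);

console.log(unanchored.test('John316')); // true, the substring "John" matches
console.log(anchored.test('John316'));   // false, the digits are not allowed
console.log(unanchored.test('2pac'));    // true, the substring "pac" matches
console.log(anchored.test('2pac'));      // false
console.log(anchored.test('Tom'));       // true
```

Without anchors the digits are never matched at all; the regex simply finds a valid run of letters next to them, which is why the inputs with numbers appeared to pass.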
I also suggest enhancing the pattern by moving ? from after the hyphen to the end of the group, so that the whole hyphenated part becomes optional (the hyphen then has no quantifier and is required inside the group, which limits backtracking and keeps regex execution linear):
^[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+(?:-[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+)?$
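As a quick check of the hyphen rule, the following sketch runs the pattern above against a few edge cases. The constant NAME_RE and the sample inputs are only illustrative.

```javascript
// The anchored pattern from above: letters, then optionally one "-letters" part.
const NAME_RE = /^[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+(?:-[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+)?$/;

const samples = ['A', 'John-Paul', 'John-', '-Paul', 'John--Paul', 'A-B-C'];
const results = samples.map((name) => [name, NAME_RE.test(name)]);

for (const [name, ok] of results) {
  console.log(`${name}: ${ok}`);
}
// A: true           (the hyphenated part is optional)
// John-Paul: true
// John-: false      (a hyphen must be followed by a letter)
// -Paul: false      (a hyphen cannot come first)
// John--Paul: false
// A-B-C: false      (only one hyphenated part is allowed)
```

Note that the pattern accepts at most one hyphen. If names with several hyphenated parts should pass, one option would be to replace the final ? with *, but that goes beyond what the question asks for.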
See regex demo.
JS snippet:
const re = /^[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+(?:-[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+)?$/;
console.log(re.test('Éve')); // => true
console.log(re.test('John-Paul')); // => true
console.log(re.test('John316')); // => false
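One caveat: the range \u00BF-\u1FFF also covers non-Latin scripts such as Greek and Cyrillic, so the pattern is broader than "Latin only". If the environment supports Unicode property escapes (the u flag with \p{...}, available in ES2018 and later), a possible alternative is sketched below. This is a swapped-in technique rather than part of the original pattern, and the names rangeBased and latinOnly are only illustrative.

```javascript
// The range-based pattern from above.
const rangeBased = /^[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+(?:-[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+)?$/;

// Alternative sketch: match by Unicode script instead of by code point range.
// Requires the "u" flag. Assumes precomposed input such as "\u00C9" for É;
// decomposed input (E + combining accent) would need normalize('NFC') first.
const latinOnly = /^\p{Script=Latin}+(?:-\p{Script=Latin}+)?$/u;

for (const name of ['\u00C9ve', 'John-Paul', 'Иван', 'John316']) {
  console.log(name, rangeBased.test(name), latinOnly.test(name));
}
// Éve true true
// John-Paul true true
// Иван true false   (Cyrillic falls inside \u00BF-\u1FFF)
// John316 false false
```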
Upvotes: 1