Reputation: 6251
How can I write a regex that matches only letters?
Upvotes: 621
Views: 1383812
Reputation: 29
Find students whose name starts with the letter A: ^[aA].*
Find students whose name starts with the letter A and ends with the letter y: ^[aA].*[y]$
Meaning: ^[aA] matches the first letter; .* matches anything in between; [y]$ matches the last letter.
List<Student> students = list.stream().filter(e->e.getFirstName().matches("^[aA].*[y]$")).collect(Collectors.toList());
Upvotes: -1
Reputation: 299
You would use
/[a-z]/gi
[] checks for any characters between given inputs
a-z covers the entire alphabet
g globally throughout the whole string
i getting upper and lowercase
Upvotes: 29
Reputation: 3443
The answers here either do not cover all possible letters, or are incomplete.
Complete regex to match ONLY unicode LETTERS, including those made up of multiple codepoints:
^(\p{L}\p{M}*)+$
(based on @ZoFreX comment)
Test it here: https://regex101.com/r/Mo5qdq/1
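Python's built-in re module does not understand \p{L} or \p{M}, so as a rough sketch the same idea can be expressed with unicodedata general categories instead of a regex. The function name is_letters_with_marks is made up for this example, and it only approximates the pattern above (first code point a letter, everything after it a letter or a combining mark):

```python
import unicodedata


def is_letters_with_marks(text: str) -> bool:
    # Mirrors the shape of ^(\p{L}\p{M}*)+$ using general categories:
    # the first code point must be a letter (L*), and every later code
    # point must be a letter (L*) or a combining mark (M*).
    major = [unicodedata.category(ch)[0] for ch in text]
    return bool(major) and major[0] == "L" and all(c in "LM" for c in major)


# "été" written with combining acute accents (U+0301) after each "e"
print(is_letters_with_marks("e\u0301te\u0301"))
print(is_letters_with_marks("abc1"))
```

The first call accepts the decomposed accents, the second rejects the digit. An empty string and a string that starts with a bare combining mark are both rejected, the same as with the regex.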
Upvotes: 1
Reputation: 5565
This one works for me in Dart. It matches ONLY Unicode letters, combining marks and spaces (not valid for numbers, special characters, emojis ...)
// notice: unicode: true
RegExp(r"^[\p{L}\p{M} ]*$", unicode: true)
Upvotes: 0
Reputation: 1330
So, I've been reading a lot of the answers, and most of them don't take exceptions into account, like letters with accents or diaeresis (á, à, ä, etc.).
I made a function in typescript that should be pretty much extrapolable to any language that can use RegExp. This is my personal implementation for my use case in TypeScript. What I basically did is add ranges of letters with each kind of symbol that I wanted to add. I also converted the char to upper case before applying the RegExp, which saves me some work.
function isLetter(char: string): boolean {
  return char.toUpperCase().match('[A-ZÀ-ÚÄ-Ü]+') !== null;
}
If you want to add another range of letters with another kind of accent, just add it to the regex. Same goes for special symbols.
I implemented this function with TDD and I can confirm this works with, at least, the following cases:
character | isLetter
${'A'} | ${true}
${'e'} | ${true}
${'Á'} | ${true}
${'ü'} | ${true}
${'ù'} | ${true}
${'û'} | ${true}
${'('} | ${false}
${'^'} | ${false}
${"'"} | ${false}
${'`'} | ${false}
${' '} | ${false}
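One thing worth checking with hand-written ranges is what else falls inside them. Below is a small Python port of the same idea (is_letter and LETTER_RANGES are names chosen for this sketch, not part of the original TypeScript) that probes two edge cases: the multiplication sign × sits at U+00D7, inside À-Ú, while Ý (U+00DD) sits just outside both ranges.

```python
import re

# Same ranges as the TypeScript version above.
LETTER_RANGES = re.compile(r"[A-ZÀ-ÚÄ-Ü]+")


def is_letter(char: str) -> bool:
    # Upper-case first so only the upper-case ranges are needed.
    return LETTER_RANGES.search(char.upper()) is not None


for sample in ["A", "ü", "(", "×", "ý"]:
    print(repr(sample), is_letter(sample))
```

With these ranges the sketch accepts × (not a letter) and rejects ý (a letter), so a range-based class needs its endpoints checked against a code chart before relying on it.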
Upvotes: 8
Reputation: 391
In python, I have found the following to work:
[^\W\d_]
This works because we are creating a new character class (the []) which excludes (^) any character from the class \W (everything NOT in [a-zA-Z0-9_]), also excludes any digit (\d) and also excludes the underscore (_).
That is, we have taken the character class [a-zA-Z0-9_] and removed the 0-9 and _ bits. You might ask, wouldn't it just be easier to write [a-zA-Z] then, instead of [^\W\d_]? You would be correct if dealing only with ASCII text, but when dealing with Unicode text:
\W
Matches any character which is not a word character. This is the opposite of \w. If the ASCII flag is used this becomes the equivalent of [^a-zA-Z0-9_].
(from the Python re module documentation)
That is, we are taking everything considered to be a word character in unicode, removing everything considered to be a digit character in unicode, and also removing the underscore.
For example, the following code snippet
import re
regex = r"[^\W\d_]"  # raw string, so the backslashes reach the regex engine unchanged
test_string = "A;,./>>?()*)&^*&^%&^#Bsfa1 203974"
re.findall(regex, test_string)
Returns
['A', 'B', 's', 'f', 'a']
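To see the Unicode point in practice, here is a small follow-up sketch comparing the same class with and without the re.ASCII flag. The sample text and the variable names are my own, picked only for illustration:

```python
import re

# Default (Unicode) behaviour for str patterns in Python 3.
LETTERS = re.compile(r"[^\W\d_]+")
# Same class, but \W and \d restricted to ASCII.
ASCII_LETTERS = re.compile(r"[^\W\d_]+", re.ASCII)

text = "naïve café 123 über_x"

unicode_runs = LETTERS.findall(text)
ascii_runs = ASCII_LETTERS.findall(text)

print(unicode_runs)  # ['naïve', 'café', 'über', 'x']
print(ascii_runs)    # ['na', 've', 'caf', 'ber', 'x']
```

Without the flag the accented letters stay inside the words; with re.ASCII they are treated as non-word characters and split the runs.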
Upvotes: 26
Reputation: 437
/^[A-z]+$/.test('asd')
// true
/^[A-z]+$/.test('asd0')
// false
/^[A-z]+$/.test('0asd')
// false
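A caveat with the A-z range: in ASCII, the characters [ \ ] ^ _ and ` sit between Z and a, so A-z accepts them too. A quick Python sketch of the difference (LOOSE and STRICT are just names for this example):

```python
import re

LOOSE = re.compile(r"[A-z]+")      # spans 0x41..0x7A, including [ \ ] ^ _ `
STRICT = re.compile(r"[A-Za-z]+")  # letters only

samples = ["asd", "a_b", "[x]", "A^B"]

for s in samples:
    loose_ok = LOOSE.fullmatch(s) is not None
    strict_ok = STRICT.fullmatch(s) is not None
    print(f"{s!r}: A-z={loose_ok} A-Za-z={strict_ok}")
```

Only 'asd' passes both; the other three pass A-z but fail A-Za-z.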
Upvotes: 2
Reputation: 1566
JavaScript
If you want to return matched letters:
('Example 123').match(/[A-Z]/gi)
// Result: ["E", "x", "a", "m", "p", "l", "e"]
If you want to replace matched letters with stars ('*') for example:
('Example 123').replace(/[A-Z]/gi, '*')
// Result: "******* 123"
Upvotes: 2
Reputation: 507
Lately I have used this pattern in my forms to check names of people, containing letters, blanks and special characters like accent marks.
pattern="[A-zÀ-ú\s]+"
Upvotes: 5
Reputation: 243
Java:
String s= "abcdef";
if(s.matches("[a-zA-Z]+")){
System.out.println("string only contains letters");
}
Upvotes: 18
Reputation: 219
Use character groups
\D
Matches any character except digits 0-9
^\D+$
See example here
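Keep in mind that \D means "anything that is not a digit", so punctuation and whitespace pass as well. A minimal Python check, with names chosen only for this sketch:

```python
import re

NON_DIGITS = re.compile(r"\D+")
LETTERS = re.compile(r"[a-zA-Z]+")


def only_non_digits(text: str) -> bool:
    return NON_DIGITS.fullmatch(text) is not None


def only_letters(text: str) -> bool:
    return LETTERS.fullmatch(text) is not None


for s in ["hello", "hello, world!", "abc1"]:
    print(repr(s), only_non_digits(s), only_letters(s))
```

"hello, world!" is accepted by ^\D+$ but not by a letters-only class, so \D fits when the goal is "no digits" rather than "letters only".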
Upvotes: 9
Reputation: 348
The regular expression that a few people have written as "/^[a-zA-Z]$/i" is not correct: the trailing /i only makes the match case-insensitive, and the pattern returns after the first match. Instead of /i, use /g, which makes the match global. You also do not need ^ and $ for the start and end if the goal is to find every run of letters in the string rather than validate the whole string.
/[a-zA-Z]+/g
Upvotes: 16
Reputation: 215
import java.util.regex.Pattern;

Pattern pattern = Pattern.compile("^[a-zA-Z]+$");
if (pattern.matcher("a").find()) {
    // ...do something...
}
Upvotes: -2
Reputation: 29061
The closest option available is
[\u\l]+
which matches a sequence of uppercase and lowercase letters. However, it is not supported by all editors/languages, so it is probably safer to use
[a-zA-Z]+
as other users suggest
Upvotes: 39
Reputation: 1155
pattern = /[a-zA-Z]/
puts "[a-zA-Z]: #{pattern.match("mine blossom")}" # OK
puts "[a-zA-Z]: #{pattern.match("456")}"
puts "[a-zA-Z]: #{pattern.match("")}"
puts "[a-zA-Z]: #{pattern.match('#$%^&*')}"
puts "[a-zA-Z]: #{pattern.match('#$%^&*A')}" # OK
Upvotes: 1
Reputation: 484
Just use \w or [:alpha:]. These are escape sequences that match only symbols which might appear in words (\w also matches digits and the underscore).
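A quick way to check what \w lets through, sketched in Python. The POSIX class [:alpha:] is not available in Python's re module, so this covers \w only, and the helper name is_word_chars is made up for the example:

```python
import re

WORD_CHARS = re.compile(r"\w+")


def is_word_chars(text: str) -> bool:
    return WORD_CHARS.fullmatch(text) is not None


for s in ["letters", "user_42", "two words"]:
    print(repr(s), is_word_chars(s))
```

"user_42" passes because digits and the underscore count as word characters, so \w on its own is broader than "letters only".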
Upvotes: 10
Reputation: 8367
If you mean any letters in any character encoding, then a good approach might be to delete non-letters like spaces \s, digits \d, and other special characters like:
[!@#\$%\^&\*\(\)\[\]:;'",\. ...more special chars... ]
Or negate those classes to describe letters directly:
\S \D and [^ ..special chars..]
Pros:
Cons:
Upvotes: 6
Reputation: 3557
Depending on your meaning of "character":
[A-Za-z] - all letters (uppercase and lowercase)
[^0-9] - all non-digit characters
Upvotes: 74
Reputation: 655129
Use a character set: [a-zA-Z] matches one letter from A–Z in lowercase and uppercase. [a-zA-Z]+ matches one or more letters and ^[a-zA-Z]+$ matches only strings that consist of one or more letters only (^ and $ mark the beginning and end of a string respectively).
If you want to match letters other than A–Z, you can either add them to the character set, e.g. [a-zA-ZäöüßÄÖÜ], or use predefined character classes like the Unicode character property class \p{L}, which describes the Unicode characters that are letters.
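For illustration, here is roughly how those three forms behave in Python. This is a sketch with helper names of my own; re.fullmatch stands in for the ^...$ anchoring:

```python
import re

ASCII_RUN = re.compile(r"[a-zA-Z]+")
EXTENDED_RUN = re.compile(r"[a-zA-ZäöüßÄÖÜ]+")


def first_letter_run(text: str):
    # Unanchored: finds the first run of letters anywhere in the string.
    match = ASCII_RUN.search(text)
    return match.group() if match else None


def is_all_letters(text: str) -> bool:
    # Anchored: the whole string must consist of one or more letters.
    return ASCII_RUN.fullmatch(text) is not None


def is_all_letters_extended(text: str) -> bool:
    # Same check with the German letters added to the set.
    return EXTENDED_RUN.fullmatch(text) is not None


print(first_letter_run("123abc456"))
print(is_all_letters("abc"), is_all_letters("abc1"))
print(is_all_letters("Grüße"), is_all_letters_extended("Grüße"))
```

The unanchored search pulls "abc" out of "123abc456", while the anchored check rejects "abc1". In Python, $ also matches just before a trailing newline, which is why the sketch uses fullmatch instead of writing ^ and $ into the pattern.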
Upvotes: 613
Reputation: 1541
/[a-zA-Z]+/
Super simple example. Regular expressions are extremely easy to find online.
http://www.regular-expressions.info/reference.html
Upvotes: 14
Reputation: 28636
\p{L} matches anything that is a Unicode letter, if you're interested in alphabets beyond the Latin one.
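Not every regex engine supports \p{L}; Python's built-in re module, for one, does not. As far as I know the closest stdlib equivalent there is str.isalpha(), which is documented as checking that every character is in one of the Unicode "Letter" categories. A small sketch (the wrapper name is mine):

```python
def is_unicode_letters(text: str) -> bool:
    # True only if the string is non-empty and every character
    # is in a Unicode Letter category.
    return text.isalpha()


for s in ["Straße", "Ελληνικά", "日本語", "abc123", "e\u0301"]:
    print(repr(s), is_unicode_letters(s))
```

The first three pass and "abc123" fails. The last sample, an "e" followed by a combining accent, also fails because the combining mark is not itself a letter.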
Upvotes: 282