Reputation: 2858
This question is a follow-up to an earlier question.
I am using \P{M}\p{M}*
to match all letters (both German and French).
I chose this regex to avoid spelling out every Unicode range explicitly, as in:
^[a-zA-Z[\\u00c0-\\u01ff]]+[\\']?(([-]?[a-zA-Z[\\u00c0-\\u01ff]]*[\\s]?)|([\\s]?[a-zA-Z[\\u00c0-\\u01ff]]*[-]?)){1,2}[a-zA-Z[\\u00c0-\\u01ff]]+$
However, despite using the Unicode syntax suggested in the previous question, characters such as ß
or è
are not matched by the regex.
I am using JDK 6.
What am I missing? Thanks!
Upvotes: 3
Views: 2415
Reputation: 425073
Use the Unicode general category \p{L}
for "any letter":
System.out.println("abcßè".matches("\\p{L}+")); // true
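If the input may also contain decomposed characters (a base letter followed by a combining accent), a variant that allows marks after each letter might be safer. The following is only a sketch: the class name LetterCheck and the method isLettersOnly are made up for illustration, and the test strings use \u escapes so that the source-file encoding is not a variable.

```java
import java.util.regex.Pattern;

public class LetterCheck {
    // One or more letters, each optionally followed by combining marks.
    private static final Pattern LETTERS = Pattern.compile("(?:\\p{L}\\p{M}*)+");

    // Hypothetical helper: true only if the whole string consists of letters.
    static boolean isLettersOnly(String s) {
        return LETTERS.matcher(s).matches();
    }

    public static void main(String[] args) {
        // "abcßè" with precomposed è (U+00E8)
        System.out.println(isLettersOnly("abc\u00df\u00e8"));
        // "abce" followed by a combining grave accent (U+0300)
        System.out.println(isLettersOnly("abce\u0300"));
        // contains a space and digits, so it is rejected
        System.out.println(isLettersOnly("abc 123"));
    }
}
```

Note that matches() requires the entire input to match, so no ^ or $ anchors are needed here.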
Upvotes: 3
Reputation: 349
Using Java 6, this code
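import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void main(String[] args) {
    String str = "hello ß you";
    // \P{M}\p{M}* matches any non-mark character plus its trailing combining marks
    Pattern p = Pattern.compile("(?:\\P{M}\\p{M}*)+");
    Matcher matcher = p.matcher(str);
    System.out.println("replaced: '" + matcher.replaceAll("") + "'");
}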
prints: replaced: ''
so the 'ß' is matched.
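One thing worth keeping in mind: \P{M} means "any character that is not a combining mark", so it matches spaces and digits as well as letters. A rough comparison, assuming the goal is to match letters only (the class name MarkVsLetter, the helper strip, and the sample string are just for illustration):

```java
import java.util.regex.Pattern;

public class MarkVsLetter {
    // Hypothetical helper: remove every match of regex from input.
    static String strip(String regex, String input) {
        return Pattern.compile(regex).matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String str = "hello \u00df 42"; // "hello ß 42"
        // Non-mark characters: letters, spaces and digits are all removed.
        System.out.println("'" + strip("\\P{M}\\p{M}*", str) + "'");
        // Letters only: the spaces and the digits are left behind.
        System.out.println("'" + strip("\\p{L}\\p{M}*", str) + "'");
    }
}
```

So the pattern does match 'ß', but it is not restricted to letters.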
Upvotes: 0