gstackoverflow

Reputation: 37034

Check that string contains non-latin letters

I have the following method to check that a string contains only Latin symbols.

private boolean containsNonLatin(String val) {
    return val.matches("\\w+");
}

But it returns false if I pass the string "my string", because it contains a space. I need a method that returns false if the string contains letters that are not in the Latin alphabet, and true in all other cases.

Please help me improve my method.

Examples of valid strings:

w123.
w, 12
w#123
dsf%&@

Upvotes: 4

Views: 8313

Answers (4)

Wiktor Stribiżew

Reputation: 626747

I need something like not p{IsLatin}

If you need to match all letters but Latin ASCII letters, you can use

"[\\p{L}\\p{M}&&[^\\p{Alpha}]]+"

By default (without the UNICODE_CHARACTER_CLASS flag), the \p{Alpha} POSIX class matches [A-Za-z]. \p{L} matches any Unicode letter, and \p{M} matches combining marks (diacritics). Adding &&[^\p{Alpha}] subtracts [A-Za-z] from that set.

The whole expression means: match one or more Unicode letters other than ASCII letters.

To also allow whitespace, add \s:

"[\\s\\p{L}\\p{M}&&[^\\p{Alpha}]]+"

See IDEONE demo:

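// Needs: import java.util.Arrays; import java.util.List;
List<String> strs = Arrays.asList("w123.", "w, 12", "w#123", "dsf%&@", "Двв");
for (String str : strs) {
    // Prints true for the first four strings and false for "Двв"
    System.out.println(!str.matches("[\\s\\p{L}\\p{M}&&[^\\p{Alpha}]]+"));
}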
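
Note that matches() tests the whole string, so a mixed string such as "abc Двв" is not caught by the check above. If the goal is "does the string contain any letter outside A-Z / a-z", a minimal sketch with find() could look like this. The class and method names are made up for illustration, and it assumes the default (non-Unicode) meaning of \p{Alpha}:

```java
import java.util.regex.Pattern;

final class NonAsciiLetterCheck {
    // One Unicode letter or combining mark that is not an ASCII letter.
    private static final Pattern NON_ASCII_LETTER =
            Pattern.compile("[\\p{L}\\p{M}&&[^\\p{Alpha}]]");

    // Hypothetical helper: true when val has no letter outside A-Z / a-z.
    static boolean hasOnlyAsciiLetters(String val) {
        // find() looks for the pattern anywhere in the string.
        return !NON_ASCII_LETTER.matcher(val).find();
    }

    public static void main(String[] args) {
        String[] samples = {"w123.", "w, 12", "w#123", "dsf%&@", "Двв", "abc Двв"};
        for (String s : samples) {
            // The first four print true, the last two print false.
            System.out.println(s + " -> " + hasOnlyAsciiLetters(s));
        }
    }
}
```

Digits, punctuation and whitespace are ignored here because they are neither letters nor marks, so only the letters decide the result.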

Upvotes: 5

anubhava

Reputation: 785058

You can use the \p{IsLatin} class:

return !(val.matches("[\\p{Punct}\\p{Space}\\p{IsLatin}]+"));

Java Regex Reference
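
One thing to watch: digits belong to the Common script, not Latin, so a whole-string match against [\p{Punct}\p{Space}\p{IsLatin}]+ does not accept "w123.". A possible variation, sketched here under the assumption that only letters should decide the result, is to search for a single letter whose script is not Latin. The class name and helper are hypothetical:

```java
import java.util.regex.Pattern;

final class LatinScriptCheck {
    // A letter (\p{L}) whose Unicode script is not Latin.
    private static final Pattern NON_LATIN_LETTER =
            Pattern.compile("[\\p{L}&&[^\\p{IsLatin}]]");

    // Hypothetical helper: true when val contains at least one non-Latin letter.
    static boolean containsNonLatin(String val) {
        return NON_LATIN_LETTER.matcher(val).find();
    }

    public static void main(String[] args) {
        System.out.println(containsNonLatin("w123."));   // false
        System.out.println(containsNonLatin("dsf%&@"));  // false
        System.out.println(containsNonLatin("café"));    // false: é is a Latin-script letter
        System.out.println(containsNonLatin("abc Двв")); // true
    }
}
```

Negate the result if you want the convention from the question (false when a non-Latin letter is present).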

Upvotes: 8

Bhuwan Prasad Upadhyay

Reputation: 3056

Use this:

public static boolean isNoAlphaNumeric(String s) {
    return s.matches("[\\p{L}\\s]+");
}
  • \p{L} matches any Unicode letter.
  • \s matches a whitespace character.
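
As a quick check of what this pattern accepts, here is a small sketch (the class and method names are invented for the demo). Since \p{L} covers letters of every script, the pattern does not tell Latin from non-Latin, and it rejects digits and punctuation:

```java
final class LetterOrSpaceDemo {
    // Same pattern as above: only Unicode letters and whitespace are allowed.
    static boolean lettersAndWhitespaceOnly(String s) {
        return s.matches("[\\p{L}\\s]+");
    }

    public static void main(String[] args) {
        System.out.println(lettersAndWhitespaceOnly("my string")); // true
        System.out.println(lettersAndWhitespaceOnly("Двв"));       // true: Cyrillic letters are \p{L} too
        System.out.println(lettersAndWhitespaceOnly("w123."));     // false: digits and '.' are not letters
    }
}
```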

Upvotes: 0

shmosel

Reputation: 50716

Just add a space to your matcher:

private boolean isLatin(String val) {
    return val.matches("[ \\w]+");
}
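
A short sketch of how this behaves on the strings from the question (the demo class name is made up). Without the UNICODE_CHARACTER_CLASS flag, \w is [a-zA-Z_0-9], so Cyrillic is rejected, but so is any punctuation:

```java
final class WordCharDemo {
    // Same pattern as above: spaces and ASCII word characters only.
    static boolean isLatin(String val) {
        return val.matches("[ \\w]+");
    }

    public static void main(String[] args) {
        System.out.println(isLatin("my string")); // true
        System.out.println(isLatin("Двв"));       // false: \w is ASCII-only by default
        System.out.println(isLatin("w123."));     // false: '.' is not a word character
        System.out.println(isLatin("w, 12"));     // false: ',' is not a word character
    }
}
```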

Upvotes: 1
