user7401478

Reputation: 1376

RegEx matcher in Android

I'm having trouble setting up a RegEx matcher in an Android environment.

My String pattern:

private static final String INVALID_PATTERN = "/[^а-яa-z0-9\\s,!\\-_{\\}\\[\\];+]/ig";

Unescaped pattern (matches everything except Cyrillic and Latin letters, digits, whitespace, comma, exclamation mark, minus, underscore, square brackets, semicolon and plus, globally and ignoring case; I consider those characters "legal"):

/[^а-яa-z0-9\s,!\-_\[\];+]/ig

My code:

public static ErrorType createStory(@NonNull String name){
    Matcher m = Pattern.compile(INVALID_PATTERN).matcher(name);
    if(m.matches()){
        Log.e("Error", "Story name '" + name + "' contains illegal characters.");
        return ErrorType.ILLEGAL;
    }
    //...
}

This, however, neither throws an error nor works.

What I tried so far and didn't work (where string is a String variable):

Upvotes: 1

Views: 534

Answers (1)

Wiktor Stribiżew

Reputation: 626896

You need to use

private static final String INVALID_PATTERN = "(?i)[а-яёa-z0-9\\s,!_{}\\[\\];+-]+";

and use it as

public static ErrorType createStory(@NonNull String name){
    Matcher m = Pattern.compile(INVALID_PATTERN).matcher(name);
    if(!m.matches()){
        Log.e("Error", "Story name '" + name + "' contains illegal characters.");
        return ErrorType.ILLEGAL;
    }
    //...
}

Explanation:

  • The (?i)[а-яёa-z0-9\\s,!_{}\\[\\];+-]+ pattern matches one or more characters from the specified ranges and chars, in a case-insensitive way (due to the embedded flag option (?i))
  • Since the regex matches a valid string, if (!m.matches()) is used to only show the error if the regex does not match the string
  • As .matches() requires a full string match, no ^ and $ anchors are necessary in the pattern
  • In Android regex, regex delimiters are not used, so the leading / and trailing /ig in your pattern are treated as literal characters; regex options are passed either via Pattern.<FLAG> constants or via inline modifiers (such as (?i))
  • Judging by the range of Cyrillic letters, you want to match Russian letters, but а-я does not include ё, which is why I added it to the character class
  • Always put the hyphen at the start or end of the character class, where it is always parsed as a literal - symbol. This is best practice and works in any regex flavor (when placed at the start, in every flavor I know).
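
To make the points above concrete, here is a minimal sketch that passes the flags via Pattern constants instead of (?i) and relies on .matches() for the full-string check. The class name StoryNameValidator and the method isValidName are hypothetical names used only for illustration. Two assumptions: the Cyrillic letters are written as Unicode escapes only so the file compiles regardless of source encoding, and Pattern.UNICODE_CASE is added because on a desktop JVM CASE_INSENSITIVE alone folds only ASCII letters (on Android, as far as I know, case folding is Unicode-aware already, so the extra flag should be harmless there).

```java
import java.util.regex.Pattern;

public class StoryNameValidator {
    // Same character class as in the answer; \u0430-\u044F is the Cyrillic
    // range and \u0451 is the extra letter that the range does not cover.
    // Flags are passed as constants rather than as an inline (?i) modifier.
    private static final Pattern VALID = Pattern.compile(
            "[\u0430-\u044F\u0451a-z0-9\\s,!_{}\\[\\];+-]+",
            Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

    // matches() must consume the whole input, so no ^ and $ anchors are needed.
    public static boolean isValidName(String name) {
        return VALID.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("My Story 1"));          // true
        System.out.println(isValidName("\u041C\u043E\u044F"));  // true (uppercase Cyrillic first letter)
        System.out.println(isValidName("bad/name"));            // false ('/' is not in the class)
        System.out.println(isValidName(""));                    // false ('+' needs at least one char)
    }
}
```

Compiling the pattern once into a static Pattern also avoids recompiling it on every call, which the code above does with Pattern.compile(INVALID_PATTERN) inside the method.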

If you want to use a negative approach, use

private static final String INVALID_PATTERN = "(?i)[^а-яёa-z0-9\\s,!_{}\\[\\];+-]";

and in the code, use if (m.find()):

public static ErrorType createStory(@NonNull String name){
    Matcher m = Pattern.compile(INVALID_PATTERN).matcher(name);
    if(m.find()){
        Log.e("Error", "Story name '" + name + "' contains illegal characters.");
        return ErrorType.ILLEGAL;
    }
    //...
}

Then, the error will be shown if any characters other than those listed in the negated character class are present in the string. .find() does not require a full string match; it allows partial matches.
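
A side benefit of the negative approach is that repeated .find() calls can report which characters are illegal. The following is only a sketch under the same assumptions as the pattern above; IllegalCharFinder and illegalChars are hypothetical names, the Cyrillic letters are written as Unicode escapes to sidestep source-encoding issues, and Pattern.UNICODE_CASE is added so that non-ASCII letters fold case on a desktop JVM as well.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IllegalCharFinder {
    // Negated class: matches exactly one character that is NOT in the legal set.
    private static final Pattern INVALID = Pattern.compile(
            "[^\u0430-\u044F\u0451a-z0-9\\s,!_{}\\[\\];+-]",
            Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

    // Collects each distinct illegal character in order of first appearance.
    public static Set<String> illegalChars(String name) {
        Set<String> found = new LinkedHashSet<>();
        Matcher m = INVALID.matcher(name);
        while (m.find()) {          // find() scans for the next partial match
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(illegalChars("a/b?c/"));   // [/, ?]
        System.out.println(illegalChars("Ok, go!"));  // []
    }
}
```

Note one behavioral difference between the two approaches: with .find() on the negated class, an empty name contains no illegal characters and is therefore accepted, whereas the positive pattern with + and .matches() rejects it.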

Upvotes: 1
