ImranRazaKhan

Reputation: 2297

Regular Expression issue with unicode characters

I have the following regular expression. It is supposed to accept the following inputs:

yes
no
b 03211111111 10

Pattern:

Pattern.compile(
    "^((B\\s(92|0)?(3[0-9]{2,9})\\s([1-9][0-9]|1[0-9]{2}|200))|(y)|(yes)|(n)|(no))$",
    Pattern.CASE_INSENSITIVE
);

But today I found that it also accepts an input like the following:

b 03211111111 10?

In the line above, the question mark is actually an inverted one; I don't know how to type it here.

It looks like some Unicode character. I just want my regular expression to accept only input like:

b 03211111111 10

Here is the code:

balShareReq = Pattern.compile(
    "^((B\\s(92|0)?(3[0-9]{2,9})\\s([1-9][0-9]|1[0-9]{2}|200))|(y)|(yes)|(n)|(no))$",
    Pattern.CASE_INSENSITIVE
);
Matcher matcher = balShareReq.matcher(vo.getMessage());
if (matcher.find()) {
    // my business logic
}

Regards, imran

Upvotes: 2

Views: 192

Answers (2)

beerbajay

Reputation: 20270

You have some other error in your program:

Pattern p = Pattern.compile(
    "^((B\\s(92|0)?(3[0-9]{2,9})\\s([1-9][0-9]|1[0-9]{2}|200))|(y)|(yes)|(n)|(no))$",
    Pattern.CASE_INSENSITIVE
);

p.matcher("b 03211111111 10?").matches();  // false
p.matcher("b 03211111111 10¿").matches();  // false
p.matcher("b 03211111111 10⸮").matches();  // false

Update

You're using find() where you should probably be using matches(). From the Javadoc, matches():

Attempts to match the entire region against the pattern.

While find():

Attempts to find the next subsequence of the input sequence that matches the pattern.

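But even find() should not match with your given pattern, unless the trailing character is really a line terminator rather than a ¿. Without the MULTILINE flag, the $ at the end of your pattern matches at the very end of the input, and also just before a single line terminator at the end of the input.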
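To make the difference concrete, here is a minimal sketch that runs the pattern from the question through both find() and matches(). The class and method names are made up for illustration, and the trailing U+0085 (NEXT LINE) character is only an assumption about what the unexpected input might contain; the question does not say which character it actually was.

```java
import java.util.regex.Pattern;

public class FindVsMatches {
    // Pattern copied from the question.
    static final Pattern BAL_SHARE_REQ = Pattern.compile(
        "^((B\\s(92|0)?(3[0-9]{2,9})\\s([1-9][0-9]|1[0-9]{2}|200))|(y)|(yes)|(n)|(no))$",
        Pattern.CASE_INSENSITIVE);

    // find(): looks for a matching subsequence; $ may sit before a final line terminator.
    static boolean viaFind(String input) {
        return BAL_SHARE_REQ.matcher(input).find();
    }

    // matches(): the whole input must be consumed by the pattern.
    static boolean viaMatches(String input) {
        return BAL_SHARE_REQ.matcher(input).matches();
    }

    public static void main(String[] args) {
        String clean = "b 03211111111 10";
        String withNel = clean + "\u0085";   // assumed stray character: NEXT LINE, a line terminator
        String withMark = clean + "\u00BF";  // inverted question mark, not a line terminator

        System.out.println(viaFind(clean) + " " + viaMatches(clean));       // true true
        System.out.println(viaFind(withNel) + " " + viaMatches(withNel));   // true false
        System.out.println(viaFind(withMark) + " " + viaMatches(withMark)); // false false
    }
}
```

If the stray character turns out to be something like this, switching to matches() (or trimming the message before matching) would reject it without changing the pattern.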

Upvotes: 1

cyptus

Reputation: 3396

Your regex fails because of the $ at the end. The $ means the match must extend to the end of the string. Remove the $, or allow some other characters before the end of the string.

-> "...|(y)|(yes)|(n)|(no))¿?$" would accept a trailing ¿ character (? means optional in regex)

-> "...|(y)|(yes)|(n)|(no))" would accept any characters at the end
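As a rough sketch of how the two variants behave with find(), the snippet below reuses the pattern from the question for the elided part. The class, constant, and method names are hypothetical. Note that both variants make the pattern more permissive, so they only help if the goal is to accept the extra character rather than reject it.

```java
import java.util.regex.Pattern;

public class TrailingAnchorDemo {
    // Body of the question's pattern, without the trailing $.
    static final String BODY =
        "^((B\\s(92|0)?(3[0-9]{2,9})\\s([1-9][0-9]|1[0-9]{2}|200))|(y)|(yes)|(n)|(no))";

    // Original form: anchored at the end.
    static final Pattern ANCHORED =
        Pattern.compile(BODY + "$", Pattern.CASE_INSENSITIVE);
    // First variant: an optional inverted question mark before the end.
    static final Pattern OPTIONAL_MARK =
        Pattern.compile(BODY + "\u00BF?$", Pattern.CASE_INSENSITIVE);
    // Second variant: no end anchor at all.
    static final Pattern UNANCHORED =
        Pattern.compile(BODY, Pattern.CASE_INSENSITIVE);

    static boolean found(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        String withMark = "b 03211111111 10\u00BF";
        System.out.println(found(ANCHORED, withMark));       // false
        System.out.println(found(OPTIONAL_MARK, withMark));  // true
        System.out.println(found(UNANCHORED, withMark));     // true
        // Without $, anything after a valid prefix is accepted as well.
        System.out.println(found(UNANCHORED, "yesterday"));  // true
    }
}
```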

Upvotes: 0
