PW1981

Reputation: 11

Spaces in Java Regular Expressions

I am new to both Java and regular expressions.

I want to detect a pattern like Section <Number>:

I have a code snippet

    String line = "Section 12: sadfdggfgfgf";
    Pattern ptn = Pattern.compile("Section [0-9+]:");
    Matcher mtch = ptn.matcher(line);

When ptn = "Section [0-9+]:", mtch is false.

I am able to detect the pattern (mtch says TRUE) when ptn = "Section [0-9+]".

Is there something I am missing about spaces in the String? I have to assume there may or may not be spaces between Section and <Number>.

Upvotes: 1

Views: 152

Answers (4)

vks

Reputation: 67968

    Section\s*[0-9]+:

You can use this to make sure the pattern matches whether or not there is whitespace between "Section" and the number.
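As a rough sketch of how this pattern behaves, the snippet below tries it against a few variations of the line from the question. The class name `SectionSpacing` and the helper `hasSection` are made up for illustration, and `find()` is assumed to be the check the question is after (a match anywhere in the line).

```java
import java.util.regex.Pattern;

final class SectionSpacing {
    // \s* allows zero or more whitespace characters between "Section" and the digits.
    // The backslash is doubled because this is a Java string literal.
    static final Pattern SECTION = Pattern.compile("Section\\s*[0-9]+:");

    // Hypothetical helper: true if a section header occurs anywhere in the line.
    static boolean hasSection(String line) {
        return SECTION.matcher(line).find();
    }

    public static void main(String[] args) {
        System.out.println(hasSection("Section 12: sadfdggfgfgf"));   // one space: true
        System.out.println(hasSection("Section12: sadfdggfgfgf"));    // no space: true
        System.out.println(hasSection("Section   12: sadfdggfgfgf")); // several spaces: true
        System.out.println(hasSection("Section: sadfdggfgfgf"));      // no number: false
    }
}
```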

Upvotes: 0

hwnd

Reputation: 70732

You need to place the quantifier after your character class. A character class defines a set of characters, any one of which can occur for a match to succeed. Currently, [0-9+] matches a single character, either a digit from 0 to 9 or a literal +, exactly "one time".

The match returns false for your pattern with a colon because the regex engine tries to match a colon immediately after a single digit, but your string has two digits before the colon. It returns true for the pattern without a colon because the engine can match the single digit that follows "Section ".

The correct syntax would be:

    Section [0-9]+:

This matches "Section" followed by a space character, then any character from 0 to 9 "one or more" times, and then a colon.
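To see the difference between a + inside the character class and a + used as a quantifier, here is a small sketch that tests both forms against whole strings with `Pattern.matches`. The class name `ClassVersusQuantifier` and the helper `fullMatch` are only illustrative names.

```java
import java.util.regex.Pattern;

final class ClassVersusQuantifier {
    // Hypothetical helper: true only if the regex matches the entire input.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // [0-9+] is one character: a digit or a literal plus sign.
        System.out.println(fullMatch("[0-9+]", "7"));   // true
        System.out.println(fullMatch("[0-9+]", "+"));   // true
        System.out.println(fullMatch("[0-9+]", "12"));  // false, two characters

        // [0-9]+ is one or more digits; the plus sign is now a quantifier.
        System.out.println(fullMatch("[0-9]+", "12"));  // true
        System.out.println(fullMatch("[0-9]+", "+"));   // false, not a digit
    }
}
```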

Upvotes: 1

Avinash Raj

Reputation: 174706

Put the + outside the character class so that it matches one or more digits. [0-9+] matches only a single character from the given list (a digit from the range 0-9 or a literal +).

    Pattern ptn = Pattern.compile("Section [0-9]+:");

When you run the regex "Section [0-9+]:", it returns false because your original string does not contain Section followed by a single digit or a literal + and then a : (note that your original string, Section 12: sadfdggfgfgf, contains two digits followed by a colon).

But "Section [0-9+]" returns true because the string contains Section followed by a single digit.
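One way to check this reasoning is to print the text that `find()` actually matched for each of the three patterns. This is only a sketch: the class `WhatMatched` and the helper `firstMatch` are invented names, and the input is the line from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WhatMatched {
    // Hypothetical helper: returns the first matched substring, or null if find() fails.
    static String firstMatch(String regex, String line) {
        Matcher m = Pattern.compile(regex).matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String line = "Section 12: sadfdggfgfgf";

        // Matches "Section 1": the class consumes only the first digit.
        System.out.println(firstMatch("Section [0-9+]", line));

        // Prints null: after "Section 1" the next character is '2', not ':'.
        System.out.println(firstMatch("Section [0-9+]:", line));

        // Matches "Section 12:": the quantifier consumes both digits.
        System.out.println(firstMatch("Section [0-9]+:", line));
    }
}
```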

Upvotes: 3

John

Reputation: 2425

If you want to accept any amount of whitespace between Section and the number, try this regex:

    Pattern.compile("Section[\\s]*[\\d]+");

For at least one whitespace character, use this:

    Pattern.compile("Section[\\s]+[\\d]+");

In Java regular expressions, \s matches a whitespace character and \d matches a digit. However, since a backslash also starts an escape sequence in a Java string literal, you must escape the backslash itself, which is why you end up with double backslashes.
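The following sketch illustrates the escaping: printing `pattern()` shows the regex the engine actually receives, with single backslashes. The class name `EscapedBackslashes` and its constants are hypothetical, and the square brackets around \s and \d are dropped here on the assumption that they are not needed around a single shorthand class.

```java
import java.util.regex.Pattern;

final class EscapedBackslashes {
    // "\\s" in Java source is the two-character regex token \s.
    static final Pattern OPTIONAL_SPACE = Pattern.compile("Section\\s*\\d+");
    static final Pattern REQUIRED_SPACE = Pattern.compile("Section\\s+\\d+");

    public static void main(String[] args) {
        // Prints the regex as the engine sees it: Section\s*\d+
        System.out.println(OPTIONAL_SPACE.pattern());

        System.out.println(OPTIONAL_SPACE.matcher("Section12").find());   // true, zero spaces allowed
        System.out.println(REQUIRED_SPACE.matcher("Section12").find());   // false, needs whitespace
        System.out.println(REQUIRED_SPACE.matcher("Section\t12").find()); // true, a tab is whitespace
    }
}
```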

You can read more about Java regular expressions and the Pattern class here: http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html

Upvotes: 0
