Stefano Cazzola

Reputation: 1687

Java - Pattern matching strange behaviour

I need to perform partial pattern matching, so I tested it with the following pattern and input:

Pattern p = Pattern.compile("hello");
Matcher m = p.matcher("[a-z]");

Can anybody explain to me why

System.out.println(m.find() || m.hitEnd());

prints true while

System.out.println(m.hitEnd());

prints false?

Upvotes: 2

Views: 1773

Answers (3)

gaborsch

Reputation: 15758

UPDATE:

Because m.find() itself scans the whole input but does not find a match (and returns false). The input is fully consumed by this call, so the hitEnd() that follows returns true.

In the second case, hitEnd() is called without a preceding match operation, so nothing has been consumed and it returns false.

For hitEnd() the Javadoc says:

Returns true if the end of input was hit by the search engine in the last match operation performed by this matcher.
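To see the two cases from the question side by side, here is a minimal sketch that runs each one on a fresh matcher. The class and method names (HitEndOrder, findThenHitEnd, hitEndOnly) are made up for illustration; only Pattern, Matcher, find() and hitEnd() come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question.
class HitEndOrder {
    // Case 1: find() runs first, so hitEnd() reports on that search.
    static boolean findThenHitEnd() {
        Matcher m = Pattern.compile("hello").matcher("[a-z]");
        // find() returns false, so || goes on to evaluate hitEnd().
        return m.find() || m.hitEnd();
    }

    // Case 2: hitEnd() is called on a fresh matcher, before any match operation.
    static boolean hitEndOnly() {
        Matcher m = Pattern.compile("hello").matcher("[a-z]");
        return m.hitEnd();
    }

    public static void main(String[] args) {
        System.out.println(findThenHitEnd()); // true, as reported in the question
        System.out.println(hitEndOnly());     // false, as reported in the question
    }
}
```

The only difference between the two methods is whether a match operation ran before hitEnd() was queried.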

Following up on the comment from @jlordo: maybe you want to swap the pattern and the text:

Pattern p = Pattern.compile("[a-z]");
Matcher m = p.matcher("hello");

because "[a-z]" looks more like a pattern than an input string.

Upvotes: 2

jlordo

Reputation: 37853

Look at this program:

Pattern p = Pattern.compile("hello");
Matcher m = p.matcher("[a-z]");
System.out.println(m.hitEnd()); // prints false
System.out.println(m.find());  // prints false
System.out.println(m.hitEnd()); // prints true

Notice that the first call to m.hitEnd() returns false. The Javadoc says:

Returns true if the end of input was hit by the search engine in the last match operation performed by this matcher.

Here it returns false because it is called before m.find(), so the matcher hasn't performed any match operation yet. After the call to m.find() it returns true, because find() consumes the complete input string and hits the end. What that means is also explained in the Javadoc:

When this method returns true, then it is possible that more input would have changed the result of the last search.

When this returns true, it means the matcher hit the end of the input. In this case, hit means reached, not matched. (The input was completely consumed by the matcher).
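Since the question is about partial matching, one common way to use this is to call matches() and then check hitEnd() when it fails. The following is only a sketch under that assumption; the names PartialMatch, Result and classify are hypothetical and not from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating partial matching with matches() + hitEnd().
class PartialMatch {
    enum Result { FULL, PARTIAL, NONE }

    static Result classify(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.matches()) {
            return Result.FULL;          // the whole input matches the regex
        }
        // matches() failed: if the engine ran out of input while matching,
        // more input could still have produced a match.
        return m.hitEnd() ? Result.PARTIAL : Result.NONE;
    }

    public static void main(String[] args) {
        System.out.println(classify("hello", "hello")); // FULL
        System.out.println(classify("hello", "hel"));   // PARTIAL
        System.out.println(classify("hello", "help"));  // NONE
    }
}
```

For "help" the mismatch at 'p' is found before the end of the input is reached, so hitEnd() stays false and no amount of extra input could turn it into a match.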

EDIT

I assume you really intend [a-z] to be the input string for your regular expression hello, and not the other way around. If you had

Pattern p = Pattern.compile("[a-z]"); // The regex needs to be compiled.
Matcher m = p.matcher("hello");       // The input is given to the matcher
while (m.find()) {                    // In this case, returns true 5 times
    System.out.print(m.group() + ", "); // print, not println, to keep the output on one line
}

your output would be

h, e, l, l, o, 

Upvotes: 2

PinkElephantsOnParade

Reputation: 6592

System.out.println(m.find() || m.hitEnd());

m.find() returns a boolean, and here that boolean is false. The search ran to the end of the input without finding a match, which makes the following m.hitEnd() return true. false || true evaluates to true, accordingly.

hitEnd() returns true only if the previous operation hit the end. Here's the Javadoc:

Returns true if the end of input was hit by the search engine in the last 
match operation performed by this matcher.

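In the second statement we DIDN'T reach the end on the last operation, because no match operation was performed at all. So hitEnd() is false, and successive calls would keep returning false until a match operation is run.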

Upvotes: 0
