Reputation: 105053
This is my Java 1.5 code (complete example):
import org.junit.Test;
import static org.junit.Assert.*;
import java.util.Scanner;
import java.util.regex.Pattern;

public class StrangeTest {
    @Test
    public void testRegExp() {
        Pattern re = Pattern.compile("(;|:)[^:;]*");
        Scanner scanner = new Scanner(":alpha");
        scanner.useDelimiter("");
        assertEquals(":alpha", scanner.next(re)); // failure
    }
}
What is wrong here?
Upvotes: 2
Views: 370
Reputation: 20732
How do you justify calling scanner.useDelimiter("")? Your pattern matches fine if you leave it out.
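To make the difference concrete, here is a minimal sketch that runs the question's pattern once with the default (whitespace) delimiter and once with the empty delimiter. The class name ScannerDelimiterDemo and the helper firstToken are hypothetical names made up for this illustration. With an empty delimiter the Scanner hands out one character per token, so next(re) only ever gets to test ":" against the pattern.

```java
import java.util.Scanner;
import java.util.regex.Pattern;

public class ScannerDelimiterDemo {
    // Hypothetical helper: returns the first token that matches the question's
    // pattern, optionally switching the Scanner to an empty delimiter first.
    static String firstToken(String input, boolean emptyDelimiter) {
        Pattern re = Pattern.compile("(;|:)[^:;]*");
        Scanner scanner = new Scanner(input);
        if (emptyDelimiter) {
            scanner.useDelimiter(""); // every character becomes its own token
        }
        try {
            return scanner.next(re);
        } finally {
            scanner.close();
        }
    }

    public static void main(String[] args) {
        System.out.println(firstToken(":alpha", false)); // prints :alpha
        System.out.println(firstToken(":alpha", true));  // prints :
    }
}
```

So the assertion in the question fails because the actual value is ":" rather than ":alpha".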
Upvotes: 0
Reputation: 5725
I don't think the Scanner class works the way you're expecting.
Scanner scanner = new Scanner(":alpha;beta");
scanner.useDelimiter("(;|:).*?");
System.out.println(scanner.next()); // gives alpha
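A slightly fuller sketch of the same idea, assuming the goal is to split the input at each ; or : and collect what lies between them. DelimiterTokensDemo and tokens are hypothetical names used only for this example. Note that the reluctant .*? at the end of the delimiter matches nothing here, so the delimiter effectively behaves like a single ; or : character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class DelimiterTokensDemo {
    // Hypothetical helper: collects every token the Scanner produces
    // when the given regex is used as the delimiter.
    static List<String> tokens(String input, String delimiterRegex) {
        Scanner scanner = new Scanner(input);
        scanner.useDelimiter(delimiterRegex);
        List<String> result = new ArrayList<String>();
        try {
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        } finally {
            scanner.close();
        }
        return result;
    }

    public static void main(String[] args) {
        // The leading ":" is skipped as a delimiter, ";" separates the two tokens.
        System.out.println(tokens(":alpha;beta", "(;|:).*?")); // prints [alpha, beta]
    }
}
```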
Upvotes: 1
Reputation: 14716
Basically, your regular expression matches any string that starts with a ":", even if it is only one character: ":" matches the expression, as do ":a", ":al", ... ":alpha". Even ":alpha;beta" is a match!
With the question mark you appended to your expression, you made it non-greedy, i.e. the shortest possible string is matched, which is ":".
Remove the question mark to make it greedy:
Pattern re = Pattern.compile("(;|:).*");
However, it will then match all of ":alpha;beta", so you need to state that the leading semicolon or colon may only be followed by characters other than a semicolon or colon:
Pattern re = Pattern.compile("(;|:)[^;:]*");
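The three variants can be compared side by side with a small sketch. It deliberately uses Matcher.lookingAt on the raw string instead of a Scanner, which is my own choice for isolating the regex behavior; GreedyVsReluctantDemo and matchedPrefix are hypothetical names. Keep in mind that Scanner.next(Pattern) tests a whole token against the pattern, so the Scanner's delimiter still decides what that token is.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsReluctantDemo {
    // Hypothetical helper: returns the prefix of input that the regex matches,
    // or null if the regex does not match at the start of the input.
    static String matchedPrefix(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.lookingAt() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = ":alpha;beta";
        System.out.println(matchedPrefix("(;|:).*?", input));    // prints :
        System.out.println(matchedPrefix("(;|:).*", input));     // prints :alpha;beta
        System.out.println(matchedPrefix("(;|:)[^;:]*", input)); // prints :alpha
    }
}
```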
Upvotes: 3