Malachi

Reputation: 33720

Java Regex Issues

I am trying to match a character, e.g. ', only if it is not preceded by the character \.

Valid État de l\'impression

Invalid État de l'impression

Valid Saisir l\'utilisateur et le domaine pour la connexion

I believe what I am after is some sort of assertion, such as a negative lookbehind?

e.g. (?<!\\)' which works fine when I am testing in RegexBuilder

However, I cannot get it to work in Java.

Code

String[] inputs = new String[] { "Recherche d'imprimantes en cours…", "Recherche  d\\'imprimantes en cours…" } ;

for(String input : inputs)
{
    Pattern p = Pattern.compile("(?<!\\\\)'");
    System.out.println(input);
    System.out.println(p.matcher(input).matches());
}

Output

Recherche d'imprimantes en cours…
false
Recherche  d\'imprimantes en cours…
false

I expected the output to be true, then false.

Upvotes: 2

Views: 233

Answers (3)

wholerabbit

Reputation: 11567

Don't call Pattern.compile() on the same pattern inside a loop -- it defeats the purpose of the "compile". Compile the pattern once and reuse it.

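import java.util.regex.Matcher;
import java.util.regex.Pattern;

String[] inputs = new String[] {
    "Recherche d'imprimantes en cours…",
    "Recherche  d\\'imprimantes en cours…"
};
// Compile once, outside the loop.
Pattern pat = Pattern.compile("(?<!\\\\)'");

for (String s : inputs) {
    Matcher mat = pat.matcher(s);
    // find() scans for the next occurrence rather than requiring the whole string to match.
    while (mat.find()) {
        System.out.format("In \"%s\"\nFound: \"%s\" (%d, %d)\n",
            s, mat.group(), mat.start(), mat.end());
    }
}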

Output:

In "Recherche d'imprimantes en cours…"
Found: "'" (11, 12)

Upvotes: 2

Thomas

Reputation: 88757

The regex should work fine, but Matcher#matches() doesn't work the way you believe it does. It only returns true if the expression matches the entire string.

From the JavaDoc on Matcher#matches():

Attempts to match the entire region against the pattern.
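
To make the difference concrete, here is a minimal sketch that contrasts matches() with find() using the pattern from the question. The class and method names (MatchesVsFind, wholeInputMatches, occursAnywhere) are made up for illustration and are not part of any API.

```java
import java.util.regex.Pattern;

public class MatchesVsFind {
    // Pattern from the question: an apostrophe not preceded by a backslash.
    static final Pattern UNESCAPED_QUOTE = Pattern.compile("(?<!\\\\)'");

    // matches() succeeds only if the pattern consumes the entire input.
    static boolean wholeInputMatches(String input) {
        return UNESCAPED_QUOTE.matcher(input).matches();
    }

    // find() succeeds if the pattern occurs anywhere in the input.
    static boolean occursAnywhere(String input) {
        return UNESCAPED_QUOTE.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeInputMatches("'"));             // true: the input is exactly one apostrophe
        System.out.println(wholeInputMatches("d'imprimantes")); // false: the pattern covers only one character
        System.out.println(occursAnywhere("d'imprimantes"));    // true: an unescaped apostrophe is present
    }
}
```

Since the pattern describes a single character, matches() can only return true for a one-character input, which is why both strings in the question print false.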

Upvotes: 1

Bart Kiers

Reputation: 170308

p.matcher(input).matches() validates the entire input. Try p.matcher(input).find() instead.
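
Applied to the code from the question, that change might look like the sketch below. The helper name hasUnescapedQuote and the class name are hypothetical; the inputs and the pattern are taken from the question.

```java
import java.util.regex.Pattern;

public class UnescapedQuoteCheck {
    // Compiled once and reused for every input.
    static final Pattern UNESCAPED_QUOTE = Pattern.compile("(?<!\\\\)'");

    // Hypothetical helper: true if the input contains an apostrophe
    // that is not directly preceded by a backslash.
    static boolean hasUnescapedQuote(String input) {
        return UNESCAPED_QUOTE.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "Recherche d'imprimantes en cours…",
            "Recherche  d\\'imprimantes en cours…"
        };
        for (String input : inputs) {
            System.out.println(input);
            System.out.println(hasUnescapedQuote(input));
        }
    }
}
```

This prints true for the first input and false for the second. One caveat: the lookbehind only inspects the single preceding character, so an apostrophe that follows an escaped backslash (\\') is also treated as escaped.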

Upvotes: 3
