user2022561

Reputation: 71

Regex search for string that contains + or # using Java

I'm doing some string searching with the Java Pattern class. I'm trying to match a string (txt) that contains "c++" or "c#".

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String txt = "c++ / c# developer";
Pattern p = Pattern.compile(".*\\b(c\\+\\+|c#)\\b.*", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(txt);
while (m.find()) {
    ...
    break;
}

m.find() always returns false. What am I doing wrong? Thanks, Ofer

Upvotes: 4

Views: 163

Answers (2)

Martin Ender

Reputation: 44259

\\b is a word boundary, which means it matches between a word character and a non-word character (or between a word character and the start or end of the input). + and # are both non-word characters, so the trailing \\b requires c++ or c# to be followed by a letter, digit or underscore. Try removing the trailing \\b or replacing it with \\B (which requires that the + or # is followed by another non-word character or by the end of the input).
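
To make the boundary behaviour concrete, here is a minimal sketch using the string from the question. The class name WordBoundaryDemo and the helper found are made up for illustration; only Pattern and Matcher come from the standard library.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the name and the helper are illustrative only.
class WordBoundaryDemo {
    // True if the regex matches anywhere in the input (find semantics).
    static boolean found(String regex, String input) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input).find();
    }

    public static void main(String[] args) {
        String txt = "c++ / c# developer";
        // Trailing \b after '+' or '#' needs a word character next, so nothing matches.
        System.out.println(found("\\b(c\\+\\+|c#)\\b", txt));
        // Dropping the trailing \b lets the match succeed.
        System.out.println(found("\\b(c\\+\\+|c#)", txt));
        // \B also succeeds here, because "c++" is followed by a space (a non-word character).
        System.out.println(found("\\b(c\\+\\+|c#)\\B", txt));
    }
}
```

The leading \\b is harmless in all three patterns, since c is a word character and is preceded either by the start of the input or by a space.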

Note that when you are using find, you don't need the .* either: find will happily return partial matches. Because the leading .* is greedy, your pattern would give you the last occurrence of either c++ or c# in the first capturing group. If that is not what you want, remove the parentheses and wildcards.

Working demo.
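
As a rough illustration of the point about find and the wildcards, the sketch below compares the pattern with and without the surrounding .* on the question's input. FindDemo, firstGroup and allMatches are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class FindDemo {
    // Returns group 1 of the first find() result, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // Collects every successive find() result for the given regex.
    static List<String> allMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String txt = "c++ / c# developer";
        // The greedy leading .* pushes the group to the last occurrence.
        System.out.println(firstGroup(".*(c\\+\\+|c#).*", txt));
        // Without the wildcards, the first find() stops at the first occurrence.
        System.out.println(firstGroup("(c\\+\\+|c#)", txt));
        // Repeated find() calls then walk through every occurrence in order.
        System.out.println(allMatches("c\\+\\+|c#", txt));
    }
}
```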

EDIT: If you are adding other alternatives that do end in word characters (like java), the cleanest solution is not to use \\b or \\B at the end at all, but to create your own boundary condition using a negative lookahead. This way you are simply saying "match if there is no word character next":

\\b(c\\+\\+|c#|java)(?!\\w)

Working demo.
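
A small sketch of how the lookahead version might be used from Java follows. The class LookaheadDemo, the method keywords and the sample sentences are assumptions made up for this example; the pattern itself is the one given above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample inputs are illustrative only.
class LookaheadDemo {
    // Leading \b, then one of the keywords, then "no word character next".
    private static final Pattern KEYWORDS =
            Pattern.compile("\\b(c\\+\\+|c#|java)(?!\\w)", Pattern.CASE_INSENSITIVE);

    // Returns every keyword occurrence in the input, in order of appearance.
    static List<String> keywords(String input) {
        Matcher m = KEYWORDS.matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // "JavaScript" is skipped because "Java" is followed by a word character.
        System.out.println(keywords("Java, C++ and c# but not JavaScript"));
    }
}
```

Unlike a trailing \\b, the lookahead behaves the same way whether the alternative ends in a word character (java) or a non-word character (c++, c#).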

Upvotes: 6

Walls

Reputation: 4010

You can try using ^.*c(\+{2}|\#).*$. It says: find a c followed by either two + characters or a #. You can see an example here.
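
For reference, a sketch of how this pattern could be written as a Java string literal, where each backslash has to be doubled. The class WholeLineDemo and the method mentionsCppOrCSharp are hypothetical names. Since the pattern is anchored with ^ and $, the sketch uses matches rather than find.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class WholeLineDemo {
    // Same regex as above, with backslashes doubled for the Java string literal.
    private static final Pattern P =
            Pattern.compile("^.*c(\\+{2}|\\#).*$", Pattern.CASE_INSENSITIVE);

    // True if the whole input matches, i.e. it contains "c++" or "c#" somewhere.
    static boolean mentionsCppOrCSharp(String input) {
        return P.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(mentionsCppOrCSharp("c++ / c# developer"));
        System.out.println(mentionsCppOrCSharp("java developer"));
        // No boundary check before the c, so a c inside a longer word also counts.
        System.out.println(mentionsCppOrCSharp("abc++"));
    }
}
```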

Upvotes: 0
