EddieB

Reputation: 5001

Android - Java - Regular Expression question - consecutive words not being matched

In my example I am trying to replace ALL occurrences of the words "the" and "a" in a string with a space, including occurrences that sit next to quotes or other punctuation.

String oldString = "A test of the exp.";
Pattern p = Pattern.compile("(((\\W|\\A)the(\\W|\\Z))|((\\W|\\A)a(\\W|\\Z)))", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(oldString);
String newString = m.replaceAll(" ");

"A test of the exp." returns "test of exp." - Yeah!

"A test of the a exp." returns "test of a exp." - Boooo!

"The a in this test is a the." returns "a in this test is the." - DoubleBoooo!

Any help would be greatly appreciated. Thanks!

Upvotes: 0

Views: 528

Answers (3)

user556260

Or, simplifying @Robokop's solution:

Pattern.compile("(\\b(the|a)\\b)",Pattern.CASE_INSENSITIVE);

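or, without the unnecessary outer capturing group:

Pattern.compile("\\b(the|a)\\b", Pattern.CASE_INSENSITIVE);

Java string literals take double quotes, and each regex backslash has to be doubled; a single-quoted '\b(the|a)\b' will not compile.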
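To make the quoting point concrete, here is a minimal sketch of how Java treats the two spellings. The class name RegexQuotingDemo and the helper strip are made up for illustration. In Java source, "\\b" is a backslash followed by b, which the regex engine reads as a word boundary, while "\b" is a single backspace character.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class RegexQuotingDemo {
    // Replaces every match of regex (case-insensitively) with a single space.
    static String strip(String regex, String input) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE)
                      .matcher(input)
                      .replaceAll(" ");
    }

    public static void main(String[] args) {
        String input = "The a in this test is a the.";

        // Two characters per escape (\ and b): the regex engine sees a word boundary.
        String wordBoundary = "\\b(the|a)\\b";

        // One character per escape (backspace, U+0008): the regex looks for a
        // literal backspace around the word, so nothing in the input matches.
        String backspace = "\b(the|a)\b";

        System.out.println("[" + strip(wordBoundary, input) + "]");
        System.out.println("[" + strip(backspace, input) + "]");
    }
}
```

With the doubled backslashes all four words are replaced; with the single backslashes the string should come back unchanged, because the input contains no backspace characters.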

Upvotes: 1

Tim Pietzcker

Reputation: 336198

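String resultString = subjectString.replaceAll("(?i)\\b(?:a|the)\\b", " ");

(?i) turns on case-insensitive matching; it is the inline equivalent of the Pattern.CASE_INSENSITIVE flag used in the question.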

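\b matches at a word boundary (i.e. at the start or end of a word, where a "word" is a run of letters, digits, or underscores). It is zero-width: it consumes no characters. In the question's pattern, \W consumes the separator after each match, so when two target words are adjacent the second one has no separator left in front of it and is skipped.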

(?:...) is a non-capturing group, needed to separate the alternative words (in this case a and the) from the surrounding word boundary anchors.
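The difference between consuming the separator and only asserting a boundary can be seen side by side in the sketch below. The class and method names (ArticleStripDemo, stripConsuming, stripBoundary) are made up for this illustration; the two patterns are the one from the question and the word-boundary version.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class ArticleStripDemo {
    // Pattern from the question: \W consumes the separator on each side of the word.
    static final Pattern CONSUMING = Pattern.compile(
            "(((\\W|\\A)the(\\W|\\Z))|((\\W|\\A)a(\\W|\\Z)))",
            Pattern.CASE_INSENSITIVE);

    // Word-boundary version: \b is zero-width, so no separator is consumed.
    static final Pattern BOUNDARY = Pattern.compile(
            "\\b(?:a|the)\\b", Pattern.CASE_INSENSITIVE);

    static String stripConsuming(String s) {
        return CONSUMING.matcher(s).replaceAll(" ");
    }

    static String stripBoundary(String s) {
        return BOUNDARY.matcher(s).replaceAll(" ");
    }

    public static void main(String[] args) {
        String input = "A test of the a exp.";

        // " the " is matched together with both spaces, so the following "a"
        // has no non-word character left before it and survives.
        System.out.println("[" + stripConsuming(input) + "]");

        // Every standalone "a" and "the" is replaced; the original spaces stay,
        // so the result contains runs of spaces.
        System.out.println("[" + stripBoundary(input) + "]");
    }
}
```

Because each removed word becomes a space and the surrounding spaces are kept, the boundary version leaves doubled spaces behind. If that matters, a follow-up such as replaceAll("\\s+", " ").trim() collapses them.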

Upvotes: 1

Robokop

Reputation: 926

Pattern.compile("(\\bthe\\b)|(\\ba\\b)",Pattern.CASE_INSENSITIVE);

Upvotes: 0
