CuriosityCalls

Reputation: 181

Determine if a String contains *exactly* 3 words and *exactly* 2 whitespaces in Java

I basically want to check if a string is formatted exactly as: "WORD1 WORD2 WORD3", where WORD1, WORD2, and WORD3 are any arbitrary words. In short, I'm trying to check if a string contains exactly two whitespace characters and exactly three words, with no numbers and no symbols other than regular letters.

I've looked extensively at other posts regarding regex in Java, but none of them seem to say how to match exactly n whitespace characters. There are similar posts, but they only seem to explain how to find strings that contain only whitespace or that contain any whitespace at all.

I looked at the Pattern class Java documentation on how to match spaces and it says that this: [ \t\n\x0B\f\r] matches "a whitespace character", which I believe includes the space, tab, newline, vertical tab, form-feed, and carriage return characters.

But when I implement the code in Java, I don't get what I expect:

import java.util.regex.Pattern;

public class WhiteSpace{
    public static void main(String[] args) {
        boolean b = Pattern.matches("[ \\t\\n\\x0B\\f\\r]", "word word word"); 
        System.out.println(b); // This prints false instead of true even though there are 2 spaces in the string.
    }
}

Even trying just "[ ]" or "\\s" doesn't seem to work. I don't have any luck with quantifiers either, such as x{2}? (to match x exactly twice). And the baffling thing is that when I try out the same thing on a regex tester website (such as regex101.com), I do indeed get the 2 matches that I want.

Some feedback would be appreciated!

Upvotes: 0

Views: 700

Answers (2)

Tim Biegeleisen

Reputation: 521053

I would use String#matches here, with the following regex pattern:

\S+\s\S+\s\S+

Sample script:

String input = "WORD1 WORD2\tWORD3";
if (input.matches("\\S+\\s\\S+\\s\\S+")) {
    System.out.println("MATCH");
}

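The above pattern should work for 3 words with exactly two whitespace characters: String#matches has to match the entire input, so the only strings it accepts are three runs of non-whitespace characters, each pair separated by exactly one whitespace character. There is no other way to arrange the 3 words to achieve this requirement.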
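As for why the code in the question prints false: as far as I know, Pattern.matches (like String#matches) only succeeds when the pattern matches the whole input, whereas a tester such as regex101 searches for matches anywhere inside it. Here is a minimal sketch of the difference; the class name WhitespaceCount and the helper countWhitespace are illustrative names of my own, not something from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WhitespaceCount {
    // Hypothetical helper: counts single whitespace characters using find().
    static int countWhitespace(String input) {
        Matcher m = Pattern.compile("\\s").matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "word word word";
        // matches() must consume the entire input, so a one-character pattern fails.
        System.out.println(Pattern.matches("\\s", input)); // false
        // find() looks for occurrences anywhere in the input.
        System.out.println(countWhitespace(input)); // 2
    }
}
```

So counting with find() reproduces what the regex tester shows, while matches() needs a pattern that describes the complete string.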

Edit:

If you want to only admit "regular" letters in the three words, then use:

(?i)[A-Z]+\s[A-Z]+\s[A-Z]+
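A quick way to sanity-check the letters-only pattern, as a sketch only. The class name ThreeWords, the method isThreeWords, and the sample inputs are assumptions for illustration. I've used Pattern.CASE_INSENSITIVE in place of the inline (?i) flag, which should behave the same for ASCII letters.

```java
import java.util.regex.Pattern;

class ThreeWords {
    // Compiled once; CASE_INSENSITIVE stands in for the inline (?i) flag.
    private static final Pattern THREE_WORDS =
            Pattern.compile("[A-Z]+\\s[A-Z]+\\s[A-Z]+", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true only if the whole input is three letter-only words.
    static boolean isThreeWords(String input) {
        return THREE_WORDS.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isThreeWords("foo bar baz"));   // true
        System.out.println(isThreeWords("foo  bar baz"));  // false: two consecutive spaces
        System.out.println(isThreeWords("foo bar 123"));   // false: digits are not letters
        System.out.println(isThreeWords("foo bar"));       // false: only two words
    }
}
```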

Upvotes: 3

OscarRyz

Reputation: 199215

Split the string and test each part.

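// Wrapped in a method so the return statements compile (the method name is illustrative).
static boolean isThreeWords(String input) {
  var count = 0;
  // The limit -1 keeps trailing empty strings, so "a b c " is rejected as well.
  for (var s : input.split(" ", -1)) {
    if (s.matches("[a-zA-Z]+")) {
      count++;
    } else {
      return false; // empty part (extra space) or a non-letter character
    }
  }
  return count == 3;
}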

It does work:

https://repl.it/repls/TruthfulLuxuriousOmnipage#Main.java

Upvotes: 0
