A D

Reputation: 809

Regex not matching words delimited by whitespace

I have an input string that will follow the pattern /user/<id>?name=<name>, where <id> is alphanumeric but must start with a letter, and <name> is a letter-only string that can have multiple spaces. Some examples of matches would be:

/user/ad?name=a a
/user/one111?name=one ONE oNe
/user/hello?name=world

I came up with the following regex:

String regex = "/user/[a-zA-Z]+\\w*\\?name=[a-zA-Z\\s]+";

All of the above examples match the regex, but only the first word of <name> seems to be picked up. Shouldn't \s allow whitespace in the name?

The code I wrote to test this is:

String regex = "/user/[a-zA-Z]+\\w*\\?name=[a-zA-Z\\s]+";
// Check to see that input matches pattern
if (Pattern.matches(regex, str)) {
    str = str.replaceFirst("/user/", "");
    str = str.replaceFirst("name=", "");
    String[] tokens = str.split("\\?");
    System.out.println("size = " + tokens.length);
    System.out.println("tokens[0] = " + tokens[0]);
    System.out.println("tokens[1] = " + tokens[1]);
} else {
    System.out.println("Didn't match.");
}

So for example, one test might look like:

/user/myID123?name=firstName LastName
size = 2
tokens[0] = myID123
tokens[1] = firstName

whereas the desired output would be

tokens[1] = firstName LastName

How can I change my regex to do this?

Upvotes: 1

Views: 1706

Answers (3)

aioobe

Reputation: 420971

Not sure what you think is the problem in your code. tokens[1] will indeed contain firstName LastName in your example.

Here's an ideone.com demo showing this.
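In case the demo link is unavailable, here is a minimal sketch that reproduces the check. It wraps the question's own logic in a hypothetical helper named tokenize (the class and method names are my own, not from the question) and assumes the input string reaches the code intact.

```java
import java.util.regex.Pattern;

class TokenDemo {
    // Regex copied unchanged from the question
    static final String REGEX = "/user/[a-zA-Z]+\\w*\\?name=[a-zA-Z\\s]+";

    // Hypothetical helper: returns {id, name}, or null when the input does not match
    static String[] tokenize(String input) {
        if (!Pattern.matches(REGEX, input)) {
            return null;
        }
        // Same steps as in the question: strip the fixed parts, then split on '?'
        String stripped = input.replaceFirst("/user/", "").replaceFirst("name=", "");
        return stripped.split("\\?");
    }

    public static void main(String[] args) {
        String[] tokens = tokenize("/user/myID123?name=firstName LastName");
        System.out.println("size = " + tokens.length);
        System.out.println("tokens[0] = " + tokens[0]);
        System.out.println("tokens[1] = " + tokens[1]);
    }
}
```

If you still see only firstName, one possibility (a guess, since the question does not show how str is read) is that the input was already cut at the first space before it reached the regex, for example by reading it with Scanner.next(), which stops at whitespace.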


However, have you considered using capturing groups for the id and the name?

If you write it like

String regex = "/user/(\\w+)\\?name=([a-zA-Z\\s]+)";

Matcher m = Pattern.compile(regex).matcher(input);

you can get hold of myID123 and firstName LastName through m.group(1) and m.group(2), once m.matches() has returned true.
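Put together, the capturing-group approach might look like the sketch below. The class GroupDemo and the helper parse are hypothetical names used only for illustration; the regex is the one shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Group 1 captures the id, group 2 captures the name
    static final Pattern USER = Pattern.compile("/user/(\\w+)\\?name=([a-zA-Z\\s]+)");

    // Hypothetical helper: returns {id, name}, or null if the whole input does not match
    static String[] parse(String input) {
        Matcher m = USER.matcher(input);
        // group() throws IllegalStateException unless a match has succeeded first
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = parse("/user/myID123?name=firstName LastName");
        System.out.println("id = " + parts[0]);
        System.out.println("name = " + parts[1]);
    }
}
```

Note that \\w+ also accepts ids that start with a digit or an underscore. To keep the requirement that the id starts with a letter, the first group could be written as ([a-zA-Z]\\w*) instead.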

Upvotes: 3

Matt

Reputation: 17629

The * quantifier is greedy by default (it consumes as many characters as it can), so you can modify your regex by adding a ? to make it reluctant, and use capturing groups to pull out the id and the name:

    List<String> str = Arrays.asList("/user/ad?name=a a", "/user/one111?name=one ONE oNe", "/user/hello?name=world");
    String regex = "/user/([a-zA-Z]+\\w*?)\\?name=([a-zA-Z\\s]+)";
    // Compile once and reuse the pattern for every input
    Pattern pattern = Pattern.compile(regex);

    for (String s : str) {
        Matcher matcher = pattern.matcher(s);
        if (matcher.matches()) {
            System.out.println("user: " + matcher.group(1));
            System.out.println("name: " + matcher.group(2));
        }
    }

Output:

user: ad
name: a a
user: one111
name: one ONE oNe
user: hello
name: world
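As a sanity check, here is a small sketch that runs the greedy and the reluctant form side by side on the same inputs. The class QuantifierDemo and the helper capture are hypothetical names. Because \w cannot match the literal ?, the id group has to stop at the same place in both forms, so I would expect them to capture identical groups here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    static final Pattern GREEDY =
            Pattern.compile("/user/([a-zA-Z]+\\w*)\\?name=([a-zA-Z\\s]+)");
    static final Pattern RELUCTANT =
            Pattern.compile("/user/([a-zA-Z]+\\w*?)\\?name=([a-zA-Z\\s]+)");

    // Hypothetical helper: returns "id|name" for a full match, or null otherwise
    static String capture(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.matches() ? m.group(1) + "|" + m.group(2) : null;
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList(
                "/user/ad?name=a a",
                "/user/one111?name=one ONE oNe",
                "/user/hello?name=world");
        for (String s : inputs) {
            // Print both results so any difference is easy to spot
            System.out.println(capture(GREEDY, s) + "  " + capture(RELUCTANT, s));
        }
    }
}
```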

Upvotes: 1

Prince John Wesley

Reputation: 63698

I don't find any fault in your code, but you can use capturing groups like this:

    String str = "/user/myID123?name=firstName LastName ";      
    String regex = "/user/([a-zA-Z]+\\w*)\\?name=([a-zA-Z\\s]+)";
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(str);
    if(m.find()) {
        System.out.println(m.group(1) + ", " + m.group(2));
    }
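One thing to keep in mind with this version: find() succeeds if the pattern occurs anywhere in the input, whereas matches() requires the entire input to fit. The sketch below illustrates the difference; the class FindVsMatches and its helpers are hypothetical names, and the trim() call is my own addition to drop trailing whitespace that [a-zA-Z\\s]+ would otherwise capture.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindVsMatches {
    static final Pattern USER =
            Pattern.compile("/user/([a-zA-Z]+\\w*)\\?name=([a-zA-Z\\s]+)");

    // True if the pattern occurs somewhere inside the input
    static boolean foundAnywhere(String input) {
        return USER.matcher(input).find();
    }

    // True only if the pattern covers the input from start to end
    static boolean matchesFully(String input) {
        return USER.matcher(input).matches();
    }

    // Hypothetical helper: name captured by find(), trimmed; null if nothing is found
    static String nameOf(String input) {
        Matcher m = USER.matcher(input);
        return m.find() ? m.group(2).trim() : null;
    }

    public static void main(String[] args) {
        String noisy = "GET /user/myID123?name=firstName LastName!";
        System.out.println("find:    " + foundAnywhere(noisy));
        System.out.println("matches: " + matchesFully(noisy));
        System.out.println("name:    " + nameOf("/user/myID123?name=firstName LastName "));
    }
}
```

If the input must follow the pattern exactly, matches() is the stricter choice; find() is more forgiving when the pattern is embedded in a longer string.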

Upvotes: 1
