KJQ

Reputation: 707

Java Regex to Parse a Path Into Multiple Optional Groups

I am trying to split this type of string using a Java regex:

/api/vX/client/domain/category/id

crudely into this:

(?:/api)?(?:/vX)?(/client/domain/...)?(?:/category)?(?:...)?

I would split it into the following groups:

Right now, I am trying to use a regex like this, but it is not working the way I expect:

(\/api)?(\/v\d+)?(\/\w+)(\/category1|category2\/?.*)?

I also need to account for leading and trailing slashes: a leading slash always starts a segment, but a trailing slash may or may not be present (unless another segment follows).

Some examples of paths and the outputs I am trying to achieve:

/client: 
[0], [1], [2]=/client, [3], [4]

/api/client: 
[0]=/api, [1], [2]=/client, [3], [4]

/api/v1/client/domain: 
[0]=/api, [1]=/v1, [2]=/client/domain, [3], [4]

/api/v1/client/domain/category1: 
[0]=/api, [1]=/v1, [2]=/client/domain, [3]=/category1, [4]

/api/v1/client/d1/d2/d3/category1: 
[0]=/api, [1]=/v1, [2]=/client/d1/d2/d3, [3]=/category1, [4]

/api/v2/client/domain/category2/id: 
[0]=/api, [1]=/v2, [2]=/client/domain, [3]=/category2, [4]=/id

Upvotes: 1

Views: 1048

Answers (1)

Mariano

Reputation: 6511

The following regex will match what you defined:

 ^(/api)?(/v\d+)?(/[^/]+(?:/[^/]+)*?)??(?:(/category[12])(/.*)?)?$
  • ^ matches the start of the string
  • (/api)? group 1 (optional)
  • (/v\d+)? group 2 (optional)
  • (/[^/]+(?:/[^/]+)*?)?? group 3 matches one or more path segments (client, domain, etc.) (optional)
    • Both the outer and the inner group use a lazy quantifier, so group 3 takes as few segments as possible and leaves a category segment for group 4.
    • [^/]+ is a character class that matches anything except slashes.
  • (?:(/category[12])(/.*)?)? is an optional non capturing group that matches:
    • (/category[12]) category1 or 2 in group 4
    • (/.*)? group 5: anything else (optional)
  • $ matches the end of the string (this is important: it forces the lazy quantifiers to keep expanding until the whole input is consumed)
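
To see why the lazy quantifiers on group 3 matter, here is a minimal sketch that compares the pattern above with a greedy variant. The GREEDY pattern, the class name and the group helper are hypothetical and exist only for this comparison; they are not part of the answer's solution.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyVsGreedy {
    // The pattern from the answer: group 3 uses lazy quantifiers (*? and ??).
    static final Pattern LAZY = Pattern.compile(
        "^(/api)?(/v\\d+)?(/[^/]+(?:/[^/]+)*?)??(?:(/category[12])(/.*)?)?$");

    // Hypothetical variant for comparison only: same pattern, greedy group 3.
    static final Pattern GREEDY = Pattern.compile(
        "^(/api)?(/v\\d+)?(/[^/]+(?:/[^/]+)*)?(?:(/category[12])(/.*)?)?$");

    // Returns the text captured by group n, or null if the pattern does not
    // match or the group did not participate in the match.
    static String group(Pattern p, String path, int n) {
        Matcher m = p.matcher(path);
        return m.matches() ? m.group(n) : null;
    }

    public static void main(String[] args) {
        String path = "/api/v2/client/domain/category2/id";
        // Lazy: group 3 stops early so the category lands in group 4.
        System.out.println("lazy   group 3: " + group(LAZY, path, 3));
        System.out.println("lazy   group 4: " + group(LAZY, path, 4));
        // Greedy: group 3 swallows the category and everything after it.
        System.out.println("greedy group 3: " + group(GREEDY, path, 3));
        System.out.println("greedy group 4: " + group(GREEDY, path, 4));
    }
}
```

With the greedy variant, group 3 consumes /client/domain/category2/id and group 4 stays null, because the optional category group is allowed to match nothing.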

Code

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "/api/v2/client/domain/category2/id";
String pattern = "^(/api)?(/v\\d+)?(/[^/]+(?:/[^/]+)*?)??(?:(/category[12])(/.*)?)?$";
Pattern regex = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);
Matcher m = regex.matcher(text);

// The pattern is anchored with ^ and $, so there is at most one match.
// Groups that did not participate in the match are returned as null.
if (m.find())
{
    System.out.println("api: " + m.group(1) + 
                       "\nversion: " + m.group(2) +
                       "\nclient: " + m.group(3) +
                       "\ncategory: " + m.group(4) +
                       "\nextra: " + m.group(5));
}

Output

api: /api
version: /v2
client: /client/domain
category: /category2
extra: /id
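
The question also asks about optional trailing slashes and about groups that may be empty. As written, the pattern does not match a path such as /client/. A minimal sketch of one way to handle both, assuming it is acceptable to strip trailing slashes before matching and to report missing groups as empty strings (the PathGroups class and the parse helper are hypothetical names, not part of the original answer):

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PathGroups {
    // The pattern from the answer, unchanged.
    static final Pattern PATH = Pattern.compile(
        "^(/api)?(/v\\d+)?(/[^/]+(?:/[^/]+)*?)??(?:(/category[12])(/.*)?)?$",
        Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns the five groups as an array, using "" for
    // groups that did not participate, or null if the path does not match.
    static String[] parse(String path) {
        // Assumption: trailing slashes carry no meaning, so "/client/" is
        // treated like "/client".
        String normalized = path.replaceAll("/+$", "");
        Matcher m = PATH.matcher(normalized);
        if (!m.matches()) {
            return null;
        }
        String[] groups = new String[m.groupCount()];
        for (int i = 0; i < groups.length; i++) {
            String g = m.group(i + 1); // regex groups are numbered from 1
            groups[i] = (g == null) ? "" : g;
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] samples = {
            "/client",
            "/api/client",
            "/api/v1/client/domain/category1",
            "/api/v1/client/d1/d2/d3/category1",
            "/api/v2/client/domain/category2/id",
            "/api/v1/client/"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + Arrays.toString(parse(s)));
        }
    }
}
```

The returned array is indexed from 0, like the [0] to [4] notation in the question, while the regex groups themselves are numbered 1 to 5.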


Upvotes: 4
