Reputation: 6051
I'm trying to match links in some legal documents. I've gotten fairly far, but I think I'm missing something. This is my work so far:
(\d( )?)?(([[a-zA-Z]\.])+?) ([0-9]+?)\b:([0-9]+?)?\b
I have a base construction which I can match:
? = optional
number/space?/string/space/number/:/number
But now I want to optionally match any combination of the following:
-/number
,/space/number
,/space/number/-/number
This is my best match:
(\d( )?)?(([[a-zA-Z]\.])+?) ([0-9]+?)\b:([0-9]+?)(, [0-9]+?)?(-[0-9]+?)?(, ([0-9]+?)-([0-9]+?)?)?\b
I can match this:
8 Law 84:145, 252-320
But not this:
8 Law 84:145, 252-320, 458, 517-665
Upvotes: 1
Views: 101
Reputation: 626738
You may use
(\d+)\s*([a-zA-Z]+)\s+(\d+):(\d+)((?:-\d+|,\s\d+(?:-\d+)?)*)
See the regex demo
The main part I added is ((?:-\d+|,\s\d+(?:-\d+)?)*)
which matches and captures into a group 0 or more sequences of:
-\d+ - a hyphen and 1+ digits
| - or
,\s\d+(?:-\d+)? - a comma, a whitespace, 1+ digits, and then an optional sequence of - and 1+ digits.
Do not forget to double the backslashes in the Java string literal inside the code.
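As a rough illustration of the backslash doubling, here is a minimal sketch of how the pattern might be compiled and applied in Java. The class name CitationMatcher and the helper groups are hypothetical names chosen for this example, not part of the original answer; only java.util.regex from the standard library is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class CitationMatcher {
    // Same pattern as in the answer; every backslash is doubled
    // because it lives inside a Java string literal.
    static final Pattern CITATION = Pattern.compile(
        "(\\d+)\\s*([a-zA-Z]+)\\s+(\\d+):(\\d+)((?:-\\d+|,\\s\\d+(?:-\\d+)?)*)");

    // Returns capturing groups 1..5 of the first match,
    // or an empty list when the input contains no match.
    static List<String> groups(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = CITATION.matcher(input);
        if (m.find()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                out.add(m.group(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> g = groups("8 Law 84:145, 252-320, 458, 517-665");
        // Group 5 (index 4) holds the whole optional tail:
        // ", 252-320, 458, 517-665"
        System.out.println(g.get(4));
    }
}
```

The optional tail comes back as a single string in group 5, so if the individual numbers or ranges are needed, that string would have to be split in a second step.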
Upvotes: 1