Reputation: 55
I'm trying to write a regular expression that matches enumerations in law articles so that I can apply some style modifications to them.
Here is my current regex:
/\R([0-9a-zA-Z])(\.|\))(.*?)(\R\R|$)/gs
https://regex101.com/r/WtT0cT/1
As you can see on regex101, the problem is with the sub-enumerations in enumeration number 3.
My regex doesn't need to capture each sub-enumeration separately, but it should capture all the text that belongs to the enumeration. For number 3, that means it should match the following:
some text 3 More text in number 3
a) sub-enumeration a in 3
b) sub-enumeration b in 3
c) sub-enumeration c in 3
d) sub-enumeration d in 3
Some text which belongs to no sub-enumeration but to enumeration 3
Any idea?
Upvotes: 1
Views: 149
Reputation: 91488
\h*[0-9a-zA-Z][.)][\s\S]+?(?=\R+\d|$)
Explanation:
\h* : 0 or more horizontal spaces
[0-9a-zA-Z] : 1 alphanumeric
[.)] : dot or closing parenthesis
[\s\S]+? : 1 or more of any character (including line breaks), not greedy
(?= : lookahead
\R+\d : 1 or more line breaks, followed by a digit
| : OR
$ : end of string
) : end lookahead
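To try the pattern outside regex101, here is a minimal Python sketch. It rests on a few assumptions: Python's `re` module does not support `\h` or `\R`, so they are approximated here with `[ \t]` and `(?:\r?\n)`, and the sample text is a shortened, made-up stand-in for the text in the regex101 link.

```python
import re

# Python's `re` has no \h or \R, so they are approximated (assumption):
#   \h  ->  [ \t]        (space or tab)
#   \R  ->  (?:\r?\n)    (LF or CRLF line break)
ENUM_PATTERN = re.compile(r"[ \t]*[0-9a-zA-Z][.)][\s\S]+?(?=(?:\r?\n)+\d|$)")

# Shortened stand-in for the sample text from the question (assumption).
text = (
    "1. some text 1\n"
    "\n"
    "2. some text 2\n"
    "\n"
    "3. some text 3 More text in number 3\n"
    "a) sub-enumeration a in 3\n"
    "b) sub-enumeration b in 3\n"
    "Some text which belongs to no sub-enumeration but to enumeration 3\n"
    "\n"
    "4. some text 4"
)

# Each match is one top-level enumeration, including its sub-enumerations.
matches = ENUM_PATTERN.findall(text)

for item in matches:
    print(repr(item))
```

Because the lookahead only stops at a line break followed by a digit, lines starting with `a)`, `b)` and so on stay inside enumeration 3. The flip side is that any line inside an enumeration that begins with a digit would end the match early.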
Upvotes: 2