Reputation: 2473
I want to parse some nested Markdown lists, like the one below:
* elem 1
* elem 2
	* child 1
	* child 2
		* child 1
* elem 3
	* child 1
The nesting is done with tabs, so each level has a fixed number of leading tabs.
I'm looking for a regex that gives me the items of each level, e.g. Level 3 has \t\t, Level 2 has only \t, and Level 1 has no tab, but all items start with *.
How can I match these requirements with different regexes?
One attempt for the Level 1 elements was:
^(?=\*).*
But this selects only the first element of Level 1 (e.g. elem 2 and elem 3 are not found).
BR,
mybecks
Upvotes: 1
Views: 2242
Reputation: 1899
If I understand you correctly, you want this:
/^\*.*?(?=^\*|\Z)/sm
Basically it means: match from the beginning of a line, match a literal *, then anything non-greedily up to, but not including, the next ^\* or the end of the input.
EDIT:
This won't work for you, as JavaScript doesn't support \Z. Oops, I had the wrong regex engine flavour enabled; will update shortly :)
EDIT 2:
This should work in JavaScript (with the m flag set, so that ^ matches at the start of every line):
^\*[^]+?(?=^\*)|^\*[^]+
I had to use an alternation for the very last element, i.e. if you remove |^\*[^]+ from the end of the regex, it won't match the last element :(.
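As a rough illustration of how the alternation behaves, here is a minimal sketch that runs the pattern with the g and m flags. The sample string and the variable names are assumptions made for this example; the input is assumed to use real tab characters for nesting.

```javascript
// Sample input assumed for illustration; nesting uses tab characters.
const sample = [
  "* elem 1",
  "* elem 2",
  "\t* child 1",
  "\t* child 2",
  "\t\t* child 1",
  "* elem 3",
  "\t* child 1",
].join("\n");

// m: ^ matches at the start of every line; g: return every match.
// [^] is a JavaScript idiom for "any character, including newlines".
const blockPattern = /^\*[^]+?(?=^\*)|^\*[^]+/gm;

// Each match is one top-level item together with its nested children.
const blocks = sample.match(blockPattern);

console.log(blocks.length); // 3
console.log(JSON.stringify(blocks[2])); // "* elem 3\n\t* child 1"
```

The second alternative only ever matches the final block, because for every earlier block the lazy first alternative finds a following line that starts with *.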
Upvotes: 1
Reputation: 27843
Here is a function that returns a regexp (based on yours) for matching all the elements on a certain level:
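function getNestedRegexp(level) {
    // level = number of leading tabs (0 matches the top-level items)
    return new RegExp('^(?=\\t{' + level + '}\\*).*', 'gm');
}
// Usage:
var elements = str.match(getNestedRegexp(1)); // all elements nested one level deep (one leading tab)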
DEMO: http://jsbin.com/EcAKIza/1/edit
As others have mentioned, regexp may not be the best solution here, so be careful if you pick this option.
EDIT: I am not sure why you are using a positive lookahead there. A better regexp (with N replaced by the number of leading tabs) could be:
/^\t{N}\*.*/gm
DEMO & EXPLANATION: http://regex101.com/r/rZ7mD1
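To tie this back to the question's numbering (Level 1 has no tab, Level 2 has one tab, and so on), the pattern can be built from a 1-based level. This is only a sketch: itemsAtLevel and the sample string are hypothetical names introduced here, and the input is assumed to be indented with real tab characters.

```javascript
// Hypothetical helper: level 1 = no tab, level 2 = one tab, and so on.
function itemsAtLevel(text, level) {
  // \t{0} matches the empty string, so level 1 only needs a leading "*".
  const pattern = new RegExp("^\\t{" + (level - 1) + "}\\*.*", "gm");
  // match() returns null when nothing is found; normalise that to [].
  return text.match(pattern) || [];
}

// Sample input assumed for illustration.
const sample = [
  "* elem 1",
  "* elem 2",
  "\t* child 1",
  "\t* child 2",
  "\t\t* child 1",
  "* elem 3",
  "\t* child 1",
].join("\n");

console.log(itemsAtLevel(sample, 1)); // [ '* elem 1', '* elem 2', '* elem 3' ]
console.log(itemsAtLevel(sample, 3).length); // 1
```

Because ^ is followed by an exact tab count and then a literal *, a more deeply indented line cannot match a shallower level. The matches keep their leading tabs, so trim them if only the item text is needed.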
Upvotes: 1