Tobi

Reputation: 884

RegEx: Grabbing paragraphs in markdown

I have a markdown file and want to grab the whole text between the first two subheadlines, including the line breaks and any subsubheadlines in between.

This is the given markdown

# main stuff

random text

## sub stuff


### subsub stuff

* bla bla
* bla bla


### subsub2

* **bold stuff:** blabla ([#11](https://blalba)) 
* **bold stuff:** blabla ([#11](https://blalba)) 
* **bold stuff:** blabla ([#11](https://blalba)) 


## sub staff 2

### subsub

* blaa
* blaa

## sub staff 3

### blaaa

* blubb
* **bold stuff:** blabla ([#11](https://blalba)) 

## sub staff 4

### subsub

* blaa
* blaa


## sub staff 5

### blaaa

* blubb

I want the part between the first two ##. So in this example I want the following:

## sub stuff


### subsub stuff

* bla bla
* bla bla


### subsub2

* **bold stuff:** blabla ([#11](https://blalba)) 
* **bold stuff:** blabla ([#11](https://blalba)) 
* **bold stuff:** blabla ([#11](https://blalba)) 

What I tried

  1. ## ([^## ]*)## but this does not contain the line breaks
  2. ## [\s\S]*## but this contains all characters until the last ## in the file

I need a combination of the two, something like ## ([\s\S^## ]*)##, but that is not valid regex for what I want.

Upvotes: 1

Views: 264

Answers (1)

Wiktor Stribiżew

Reputation: 627082

It looks like you can use

(?s)## .*?(?=\s*\n## |$)

See the regex demo

The pattern matches

  • (?s) - a DOTALL modifier that makes . match line break chars as well
  • ## - the literal string ## followed by a space
  • .*? - any 0+ chars, as few as possible
  • (?=\s*\n## |$) - a positive lookahead that requires the current position to be immediately followed by 0+ whitespace chars, a newline, and then ## and a space, or by the end of the string.
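For illustration, here is a minimal sketch of how the pattern could be applied in Python. The question does not say which regex engine is in use, so Python's re module is an assumption, and the function name extract_first_section and the shortened sample text are made up for this example.

```python
import re
from typing import Optional

# The pattern from the answer: (?s) lets . match newlines, and the lazy .*?
# stops at the first position that is followed by optional whitespace,
# a newline and "## ", or by the end of the string.
SECTION_RE = re.compile(r"(?s)## .*?(?=\s*\n## |$)")


def extract_first_section(markdown: str) -> Optional[str]:
    """Return the first '## ' section of the text, or None if there is none.

    Hypothetical helper for this example, not part of any library.
    """
    match = SECTION_RE.search(markdown)
    return match.group(0) if match else None


# Shortened version of the markdown from the question.
sample = """# main stuff

random text

## sub stuff


### subsub stuff

* bla bla
* bla bla


### subsub2

* **bold stuff:** blabla ([#11](https://blalba))

## sub staff 2

### subsub

* blaa
"""

first_section = extract_first_section(sample)
print(first_section)
```

The ### lines do not end the match because the lookahead needs a newline followed by exactly two # characters and a space. The blank lines before the next ## heading are left out of the result, since the \s* inside the lookahead consumes them without adding them to the match.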

Upvotes: 1
