Christian

Reputation: 5531

Regex to find and remove newlines between Jekyll frontmatter (---)

I've started looking into regular expressions and know some basics. I'm now trying to remove category and subcategory from a lot of Jekyll markdown files using regex. I managed to remove the two category lines, but some blank lines (\n) are left behind.

I would like to remove the blank lines between the hyphens (---).

The current state looks like this (3 possible states):

---
title: safdsdf


---

or

---
title: safdsdf


toc: 
---

or

---
title: safdsdf


toc:
redirect_from: 
---

Using (?!-\n)(\w) I find the text, but I am not able to match only the newlines. (?!-{3}\n)(\n\n) gives me all newlines all over the document. Regex is super hard for me, so any help is really appreciated.

Upvotes: 0

Views: 448

Answers (2)

The fourth bird

Reputation: 163477

For those 3 possible states, you could capture the opening --- and the non-empty lines that follow it in capturing group 1.

Then match the run of whitespace chars (the blank lines) and capture in group 2 any remaining lines up to and including the closing ---

^(---(?:\r?\n(?!--|\s*$).*)*)\s*((?:\r?\n(?!---).*)*\r?\n---)$
  • ^ Start of the line (the pattern needs the multiline flag so that ^ and $ match at line boundaries)
  • ( Capture group 1
    • --- Match literally
    • (?: Non-capturing group
      • \r?\n Match a newline
      • (?!--|\s*$) Negative lookahead: assert that the line does not start with -- and is not blank (only whitespace chars)
      • .* Match any char except a newline 0+ times
    • )* Close the non-capturing group and repeat 0+ times
  • ) Close group 1
  • \s* Match 0+ whitespace chars (the blank lines to remove)
  • ( Capture group 2
    • (?: Non-capturing group
      • \r?\n Match a newline
      • (?!---) Negative lookahead: assert that the line does not start with ---
      • .* Match any char except a newline 0+ times
    • )* Close the non-capturing group and repeat 0+ times
    • \r?\n--- Match a newline and the closing ---
  • ) Close group 2
  • $ End of the line


In the replacement use the 2 capturing groups

$1$2
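
As a rough illustration, the pattern and replacement above could be applied in Python along these lines. This is only a sketch and not part of the original answer: the function name strip_front_matter_gap and the sample text are made up for the example, re.MULTILINE is assumed so that ^ and $ match at line boundaries, and Python writes the replacement as \1\2 instead of $1$2.

```python
import re

# Pattern copied from the answer above.
# re.MULTILINE makes ^ and $ match at the start and end of each line.
FRONT_MATTER_GAP = re.compile(
    r"^(---(?:\r?\n(?!--|\s*$).*)*)\s*((?:\r?\n(?!---).*)*\r?\n---)$",
    re.MULTILINE,
)


def strip_front_matter_gap(text: str) -> str:
    """Drop the run of blank lines inside the first --- ... --- block."""
    # count=1 restricts the substitution to the first match in the text.
    return FRONT_MATTER_GAP.sub(r"\1\2", text, count=1)


# Hypothetical sample modelled on the second state from the question.
sample = "---\ntitle: safdsdf\n\n\ntoc: \n---\n\nBody text\n\nMore text\n"
cleaned = strip_front_matter_gap(sample)
print(cleaned)
```

The blank lines in the body after the closing --- are left alone, because the match ends at the closing delimiter.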

Upvotes: 1

user11809641

Reputation: 895

Try this:

Search: (\n){2,}

Replace: \n

\n is a line break.

(...){2,} means two or more times (the parentheses just group the characters so that the curly braces apply to the whole group).

Depending on your OS, you may need to adjust the line break character. For example, Windows uses \r\n.
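
If the replacement should only touch the front matter and not the rest of the document, one option is to locate the front matter block first and collapse newline runs inside it. A minimal Python sketch, assuming the front matter is the first thing in the file and is delimited by lines containing only ---; the names FRONT_MATTER, BLANK_RUN, and collapse_front_matter are hypothetical, and lines containing only spaces are not treated as blank here:

```python
import re

# Assumption: front matter starts at the very beginning of the file and
# ends at the next line that consists only of '---'.
FRONT_MATTER = re.compile(r"\A---\r?\n.*?^---\r?$", re.DOTALL | re.MULTILINE)

# Two or more consecutive line breaks (\n or \r\n).
BLANK_RUN = re.compile(r"(\r?\n){2,}")


def collapse_front_matter(text: str) -> str:
    """Collapse runs of line breaks, but only inside the front matter."""

    def _collapse(match: re.Match) -> str:
        # \1 is the last line break of the run, so \r\n files keep \r\n.
        return BLANK_RUN.sub(r"\1", match.group(0))

    return FRONT_MATTER.sub(_collapse, text, count=1)


# Hypothetical sample modelled on the third state from the question.
sample = "---\ntitle: safdsdf\n\n\ntoc:\nredirect_from: \n---\n\nBody\n\nMore\n"
result = collapse_front_matter(sample)
print(result)
```

Restricting the substitution to the matched block keeps paragraph breaks in the body intact, which a document-wide (\n){2,} replacement would remove.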

Upvotes: 1
