derekvan

Reputation: 27

JavaScript regex for finding text several lines before the match

I am trying to create a regex that finds text in a Markdown file. I have "tasks" marked with - [ ] (undone) or - [x] (done), and project headers marked with ##. I would like to find every undone task together with the name of its project.

For example, for this sample text:

# Top of File

## Project A
Descriptive line

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec finibus elit non nibh lobortis molestie.

- [ ] an undone task
- [x] a completed task
- [x] second completed task

## Project B
Descriptive line

- [x] a completed task
- [ ] an uncompleted task

## Project C
Descriptive line

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec finibus elit non nibh lobortis molestie.

- [x] completed task
- [ ] uncompleted task
- [x] completed task

I would like to return:

Project A, an undone task
Project B, an uncompleted task
Project C, uncompleted task

The following gets close, but it needs a fixed number of lines to span before the task, and the number of tasks (and other lines) per project varies too much for that to work: ((.*(\n|\r|\r\n)){5})\- \[ \]

Upvotes: 1

Views: 95

Answers (1)

Tim Biegeleisen

Reputation: 521457

We can use match() with an alternation to find every line that is either a project header or an incomplete task. Then we loop over the results and join each project with its task, separated by a comma.

var input = `# Top of File

## Project A
Descriptive line

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec finibus elit non nibh lobortis molestie.

- [ ] an undone task
- [x] a completed task
- [x] second completed task

## Project B
Descriptive line

- [x] a completed task
- [ ] an uncompleted task

## Project C
Descriptive line

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec finibus elit non nibh lobortis molestie.

- [x] completed task
- [ ] uncompleted task
- [x] completed task`;

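// Collect every project header and every undone task, in document order.
var lines = input.match(/## (.*)|- \[ \] (.*)/g)
                 // Drop the leading "## " or "- [ ] " by keeping the first run of words.
                 .map(x => x.match(/\w+(?: \w+)*/g)[0]);
var output = [];
// Pair entries two at a time: lines[i] is a project, lines[i + 1] is its undone task.
// This assumes every project has exactly one undone task.
for (var i = 0; i < lines.length; i += 2) {
    output.push(lines[i] + ", " + lines[i + 1]);
}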

console.log(output);

Here is an explanation of the regex pattern used to find the matching lines:

  • ## (.*) match and capture the project text
  • | OR
  • - \[ \] (.*) match and capture the incomplete text

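With the g flag, match() returns the full matched strings rather than the capture groups, so each result still carries the leading marker (## or - [ ]), which we don't want. The map() step strips that prefix by keeping only the first run of space-separated words. Finally, we walk the array two entries at a time and join each project with its task using a comma. Note that this pairing relies on every project having exactly one undone task, as in the sample input.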
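If the number of undone tasks per project can vary (none, one, or several), a variation that remembers the most recent header may be more robust. The following is only a sketch under a few assumptions: findUndoneTasks is a hypothetical helper name, headers and tasks start at the beginning of a line, tasks that appear before any ## header are skipped, and matchAll() is available (it is in current browsers and Node.js 12+).

```javascript
// Hypothetical helper (not part of the original answer): pairs every undone
// task with the nearest "## " header above it.
function findUndoneTasks(markdown) {
  const output = [];
  let project = null; // text of the most recent "## " header, if any

  // The m flag makes ^ and $ match at line boundaries; the g flag is
  // required by matchAll(). Group 1 is a project name, group 2 a task.
  const pattern = /^## (.*)$|^- \[ \] (.*)$/gm;

  for (const m of markdown.matchAll(pattern)) {
    if (m[1] !== undefined) {
      // A header line: remember it for the tasks that follow.
      project = m[1].trim();
    } else if (project !== null) {
      // An undone task under a known project.
      output.push(project + ", " + m[2].trim());
    }
    // Assumption: undone tasks before the first header are ignored.
  }
  return output;
}

// Usage with a small input where Project A has two undone tasks
// and Project B has none.
const sample = [
  "## Project A",
  "- [ ] first task",
  "- [x] a completed task",
  "- [ ] second task",
  "",
  "## Project B",
  "- [x] a completed task",
].join("\n");

console.log(findUndoneTasks(sample));
// → [ 'Project A, first task', 'Project A, second task' ]
```

Because matchAll() exposes the capture groups directly, the map() step is no longer needed, and task text containing punctuation is kept intact.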

Upvotes: 1
