LppEdd

Reputation: 21172

RegExp to match conventional commit syntax

I've been fiddling with this since yesterday.
I just can't seem to match all the possible cases.

I'm trying to come up with a regular expression which matches a Conventional Commit, but which also offers some error recovery functionality.

Current regexp:

(?<type>build)(?<scope>\(.*\)?(?=:))?(?<breaking>!)?(?<subject>:.*)?

Inputs:

build(one)
build(two)!
build(three)!:test
build(example:module)!: test
build: test
build(<> : dda!sd): test
build(:
build

Outputs:

[image: match results for the inputs above]

What doesn't work:


The sample is at Regex101, https://regex101.com/r/XYC04q/1
And I have other (16) tests here, https://regex101.com/r/sSrvyA/11

Even if you have no time to try and modify it, any comment is appreciated.

Upvotes: 17

Views: 13810

Answers (7)

cremedekhan

Reputation: 67

I recently took a stab at this too. Based on a few answers here, some trial and error, and other assistance, I came up with the following, which mostly covers the requirements laid out in https://www.conventionalcommits.org/en/v1.0.0/#specification. The one case my regex does not cover is a commit message with multiple footers: only the first footer is captured.

Regex:

Initial commit|Merge [^\r\n]+|(?:(?<type>build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test|BREAKING CHANGE)(?<scope>\(\w+\))?(?<breaking_change>!?): (?<summary>[\w -]+))(?<=\v\v){0,2}(?<body>[\w\s-]+)(?<footer>(?<=\v\v)(?<footer_token>[\w-]+): (?<footer_value>[\w -]+)|$)

Try it out here: https://regex101.com/r/VG2n9I/1


After revisiting this and comparing notes with other answers here, I came up with another solution. It makes the regex more portable across regex implementations by replacing the variable-length look-around with an optional non-capturing group. It also captures multiple footers, courtesy of Ahmed Kamal's answer.

Initial commit|Merge [^\r\n]+|(?:(?P<type>build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test|BREAKING CHANGE)(?:\((?P<scope>[\w.-]+)\))?(?P<breaking_change>!?): (?P<summary>[\x20-\x39\x3B-\x7E]+))(?:\v\v)?(?P<body>[\x20-\x39\x3B-\x7E\s]+)(?P<footer>(?<=\v\v)(?:(?P<footer_token>[\w\s-]+): (?P<footer_value>[\w -`]+))+|$)

Upvotes: 2

Ahmed Kamal

Reputation: 136

I wrote this based on the answer of cremedekhan, and it covers multiple footer entries for the mapping from footer_token to footer_value.

(?<initial_commit>^Initial commit\.?)|(?<merge>^Merge [^\r\n]+)|(?<type>^build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test|¯\\_\(ツ\)_\/¯)(?:\((?<scope>[\w-]+)\))?(?<breaking>!)?: (?<summary>[\w ,'.`:-]+)(?<=\v\v){0,2}(?<body>[\w\s ,'.`\[\]-]+)(?<footer>(?<=\v\v)(?:(?<footer_token>[\w\s-]+): (?<footer_value>[\w -`]+))+|$)

Try it out here: https://regex101.com/r/JB5nb8/1

Upvotes: 1

Zach Bonfil

Reputation: 331

I use this:

^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)(\((.*?)\))?: (.*?)$

https://www.regextester.com/109925
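
A minimal sketch of how the groups in this pattern could be read out, assuming Python's re module. The helper name parse_header and the sample commit headers are made up for illustration.

```python
import re

# The pattern from this answer, unchanged.
PATTERN = re.compile(
    r"^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)"
    r"(\((.*?)\))?: (.*?)$"
)


def parse_header(header):
    """Return (type, scope, subject) for a matching header line, else None."""
    match = PATTERN.match(header)
    if match is None:
        return None
    # Group 2 is the scope with its parentheses, group 3 the scope alone.
    return match.group(1), match.group(3), match.group(4)


print(parse_header("feat(parser): add array support"))
print(parse_header("fix: handle empty input"))
print(parse_header("feat(api)!: drop v1 endpoints"))
```

The last call returns None: the pattern has no place for the optional ! marker, so headers that flag a breaking change this way are rejected.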

Upvotes: 1

Ders

Reputation: 1770

Building on codeangler's answer, here is a robust regex I created that additionally satisfies items 6 and 7 of the Conventional Commits 1.0.0 Specification, regarding body and footer paragraphs and their line breaks.

\A(((Initial commit)|(Merge [^\r\n]+)|((build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)(\(\w+\))?!?: [^\r\n]+((\r|\n|\r\n)((\r|\n|\r\n)[^\r\n]+)+)*))(\r|\n|\r\n)?)\z

Notes:

  • line break characters are allowed within body paragraphs to allow for breaking long lines
  • a trailing line break is allowed and can be disallowed by removing the final (\r|\n|\r\n)?
  • the (Merge [^\r\n]+) part can be removed if using a fast-forward git workflow with no merge commits
  • the 'not a line break' parts [^\r\n] can be replaced with . if the engine does not / is not set to match line breaks with .
  • other parts of the specification such as the elaborate footer content requirements are not addressed
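
As a rough check of the notes above, here is a sketch that runs the pattern with Python's re module. One assumption: the trailing \z is written as \Z, which is Python's spelling of "end of string only". The function name is_conventional and the sample messages are invented for the example.

```python
import re

# Same pattern as above, with \z written as \Z for Python.
PATTERN = re.compile(
    r"\A(((Initial commit)|(Merge [^\r\n]+)|"
    r"((build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)"
    r"(\(\w+\))?!?: [^\r\n]+((\r|\n|\r\n)((\r|\n|\r\n)[^\r\n]+)+)*))"
    r"(\r|\n|\r\n)?)\Z"
)


def is_conventional(message):
    """True if the whole commit message matches the pattern."""
    return PATTERN.search(message) is not None


with_body = (
    "fix: correct typo\n"
    "\n"
    "The word was misspelled.\n"
    "Second line of the body.\n"
    "\n"
    "Refs: #123"
)
no_blank_line = "fix: correct typo\nbody without a blank line"

print(is_conventional("feat(api)!: add endpoint"))
print(is_conventional(with_body))
print(is_conventional(no_blank_line))
print(is_conventional("update stuff"))
```

The third message is rejected because each paragraph after the header must be introduced by a blank line.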

Try it out:

https://regex101.com/r/llDgcv/1

Upvotes: 2

Daniel Gomez Rico

Reputation: 15955

This one will validate a conventional commit but it does not break it into groups:

^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)(\(.*\))?: .*$

Original source: https://www.regextester.com/109925

Upvotes: 4

codeangler

Reputation: 877

This doesn't necessarily solve the desire to capture )!: as the breaking group, but it does seem to follow items 1-5 and 13 of the Conventional Commits specification.

^(?<type>build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test|¯\\_\(ツ\)_\/¯)(?<scope>\(\w+\)?((?=:\s)|(?=!:\s)))?(?<breaking>!)?(?<subject>:\s.*)?|^(?<merge>Merge \w+)

https://regex101.com/r/XYC04q/11

Gitlab Push Commit Regex

However, if you are using a platform like GitLab and want to set a push rule for commit messages, note that as of v13(?) it uses RE2 syntax and the Go regexp parser.

Here is the regex for GitLab.

Note that GitLab enables multiline mode (?m) globally, which gave me some difficulty. (See the discussion on GitLab.)

| Restrict by commit message | Starter 7.10 | Only commit messages that match this regular expression are allowed to be pushed. Leave empty to allow any commit message. Uses multiline mode, which can be disabled using (?-m). |

source doc
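
To illustrate why multiline mode matters, here is a small sketch using Python's re module as a stand-in for RE2 (an assumption: the two engines differ in other respects, but ^ behaves alike under the m flag). The HEADER pattern is a simplified header-only pattern written for this example, and the commit message is made up.

```python
import re

# Simplified header pattern, for illustration only.
HEADER = (
    r"^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)"
    r"(\(\w+\))?!?: .+"
)

# The first line is not a conventional header; a later body line looks like one.
message = "quick hack\n\nfix: this line is only in the body"

# With multiline mode, ^ also matches right after every line break.
multiline = re.search(HEADER, message, flags=re.MULTILINE)
# Without it, ^ matches only at the very start of the message.
single = re.search(HEADER, message)

print(multiline is not None)
print(single is not None)
```

Under multiline mode the message is accepted because of its third line, which is why a push rule may need (?-m) or explicit handling of the first line.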

## simplified and modified for GitLab
^((build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test|¯\\_\(ツ\)_/¯)(\(\w+\))?(!)?(: (.*\s*)*))|(Merge (.*\s*)*)|(Initial commit$)

## RE2 compliant but doesn't work on GitLab at this time
^(?P<type>build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test|¯\\_\(ツ\)_/¯)(?P<scope>\(\w+\))?(?P<breaking>!)?(?P<subject>:\s.*)?|^(?P<merge>Merge \w+)

https://regex101.com/r/XYC04q/28

GitLab using Terraform Push Rule Regex

If you manage GitLab with Terraform, note that Terraform parses the string before GitLab does, so the regex needs an extra level of escaping.

resource "gitlab_project_push_rules" "github_flow" {
  project = gitlab_project.project.id

  # Conventional Commits https://www.conventionalcommits.org/en/v1.0.0/
  commit_message_regex = "^((build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test|¯\\\\_\\(ツ\\)_/¯)(\\(\\w+\\))?(!)?(: (.*\\s*)*))|(Merge (.*\\s*)*)|(Initial commit$)"
  branch_name_regex    = "(^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test|¯\\\\_\\(ツ\\)_\\/¯)\\/[a-z0-9\\-]{1,55}$)|master"

  prevent_secrets = true
}
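
The extra escaping can be produced mechanically. The sketch below, in Python, doubles every backslash of the GitLab regex shown earlier and yields the string used for commit_message_regex above. The helper name to_hcl_string_body is hypothetical, and it only handles backslashes and double quotes, which is all this regex needs.

```python
# The GitLab regex from earlier in this answer, before Terraform escaping.
GITLAB_REGEX = (
    r"^((build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test|¯\\_\(ツ\)_/¯)"
    r"(\(\w+\))?(!)?(: (.*\s*)*))|(Merge (.*\s*)*)|(Initial commit$)"
)


def to_hcl_string_body(regex):
    """Escape a regex for use inside a double-quoted Terraform string."""
    # Terraform consumes one level of backslashes, so each one is doubled.
    return regex.replace("\\", "\\\\").replace('"', '\\"')


print(to_hcl_string_body(GITLAB_REGEX))
```

Paste the printed text between the double quotes of the Terraform attribute.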

Upvotes: 28

The fourth bird

Reputation: 163632

You have some optional parts, for which you could indeed use a non-capturing group to match either from an opening ( to a closing ), or only an opening (.

(?<type>build)(?<scope>(?:\([^()\r\n]*\)|\()?(?<breaking>!)?)(?<subject>:.*)?
  • (?<type>build) Group type, match build
  • (?<scope> Group scope
    • (?: Non capturing group
      • \([^()\r\n]*\) Match either from opening ( till closing )
      • | or
      • \( Match a single (
    • )? Close non capturing group and make it optional
    • (?<breaking>!)? Optional group breaking
  • ) Close group scope
  • (?<subject>:.*)? Optional group subject
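
A quick way to see what each group captures is to run the pattern over the inputs from the question. This sketch assumes Python's re module, which spells named groups (?P<name>...) instead of (?<name>...). The helper name parse is made up.

```python
import re

# The pattern above, with named groups in Python's (?P<name>...) spelling.
PATTERN = re.compile(
    r"(?P<type>build)"
    r"(?P<scope>(?:\([^()\r\n]*\)|\()?(?P<breaking>!)?)"
    r"(?P<subject>:.*)?"
)

# The inputs listed in the question.
INPUTS = [
    "build(one)",
    "build(two)!",
    "build(three)!:test",
    "build(example:module)!: test",
    "build: test",
    "build(<> : dda!sd): test",
    "build(:",
    "build",
]


def parse(line):
    """Return the named groups for a line, or None if it does not match."""
    match = PATTERN.match(line)
    return match.groupdict() if match else None


for line in INPUTS:
    print(line, "->", parse(line))
```

Because the breaking group is nested inside the scope group, the scope capture includes a trailing ! when one is present, and it is an empty string rather than missing when there is no scope at all.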


Upvotes: 7
