dan j

Reputation: 157

Repeat block of regex in regex

I'm trying to figure out a nice line of regex to match the following:

1:[any chars here except newlines]|1:[any chars here except newlines]... 

I want my regex to match any number of repeats of this pattern. The closest I've come is '(1:[^|]*\|)\1+', but it doesn't work for two reasons. First, it only matches strings that have an additional pipe at the end. Second, the backreference \1 requires the text of the first capture to be repeated verbatim in every later block.

I could solve this using a split, but I just wondered if there was a nice way of doing this in a regular expression.
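For comparison, the split-based approach mentioned above might look like the following minimal sketch. The function name is_valid_by_split is hypothetical and only used for illustration.

```python
def is_valid_by_split(line):
    """Return True if every '|'-separated part starts with '1:'.

    Hypothetical helper for illustration; a trailing '|' produces an
    empty last part, which fails the startswith check.
    """
    parts = line.split("|")
    return all(p.startswith("1:") and "\n" not in p for p in parts)
```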

Upvotes: 2

Views: 3339

Answers (2)

Avinash Raj

Reputation: 174696

You could do it like this:

^(1:[^|\n]*)(?:\|(?1))*$


(?1) recurses into the first capturing group: it re-applies that group's pattern (not the text it matched, as \1 would) at the current position.

For languages that don't support recursive regex, repeat the block explicitly:

^(?:1:[^|\n]*)(?:\|1:[^|\n]*)*$


Python code:

In [10]: import re

In [11]: s = """1:[any chars here except newlines]|1:[any chars here except newlines]
...: 1:[any chars here except newlines]
...: 1:foo
...: 1:foo|1:bar
...: 1:foo|1:bar|1:baz
...: 1:foo|1:bar|1:baz|1:bak
...: 1:foo|"""
In [14]: for i in re.findall(r'(?m)^(?:1:[^|\n]*)(?:\|1:[^|\n]*)*$', s):
    ...:     print(i)
    ...:     
1:[any chars here except newlines]|1:[any chars here except newlines]
1:[any chars here except newlines]
1:foo
1:foo|1:bar
1:foo|1:bar|1:baz
1:foo|1:bar|1:baz|1:bak
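Python's built-in re module does not support (?1), so one way to avoid typing the block twice is to build the pattern from a string. This is only a sketch; the names BLOCK, PATTERN and is_valid are hypothetical and not part of the answer above.

```python
import re

# Hypothetical names, for illustration only.
BLOCK = r"1:[^|\n]*"  # one "1:..." item with no pipe or newline inside

# One block, followed by zero or more "|block" repetitions.
PATTERN = re.compile(rf"{BLOCK}(?:\|{BLOCK})*")

def is_valid(line):
    """Return True if the whole line is one or more blocks joined by '|'."""
    # fullmatch anchors both ends, so ^ and $ are not needed here.
    return PATTERN.fullmatch(line) is not None
```

Because fullmatch must consume the entire string, a trailing pipe or a trailing newline makes the check fail.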

Upvotes: 1

Amal Murali

Reputation: 76646

Apply the quantifier to the entire group:

^(?:1:[^|\n]*\|?)+(?<!\|)$

^ asserts the position at the beginning of the string. The group then matches 1: followed by any characters that are not | or a newline, zero or more times (indicated by the *), followed by an optional | (indicated by \|?). This entire group can be repeated one or more times (indicated by the +). The (?<!\|) is a negative lookbehind that asserts that the last character is not a |, which rejects a trailing pipe. $ asserts the position at the end of the string.

It matches all of these:

1:foo
1:foo|1:bar
1:foo|1:bar|1:baz
1:foo|1:bar|1:baz|1:bak

But it will not match

1:foo|

and similar.
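The behaviour described above can be checked with a short Python sketch. The pattern is copied from this answer; the helper name matches is hypothetical.

```python
import re

# Pattern copied verbatim from the answer.
pattern = re.compile(r"^(?:1:[^|\n]*\|?)+(?<!\|)$")

def matches(line):
    """Hypothetical helper: True if the pattern matches the line."""
    return pattern.search(line) is not None

# Print each sample string next to the result of the check.
for line in ["1:foo", "1:foo|1:bar", "1:foo|1:bar|1:baz", "1:foo|"]:
    print(line, matches(line))
```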


Upvotes: 5
