Ted

Reputation: 733

Regex matching?

I have the following regular expression in Python:

^1?$|^(11+?)\1+$

Since there is a pipe '|', I will split it into two regexes:

^1?$

For this, it should validate 1 or empty value. Am I correct?

^(11+?)\1+$

I expect the above regex to match the value 1111: the first pair of 1s comes from (11+?) and the second pair from \1.

When I attempt to execute it in Python, it returns true only for 1111 but not 11 or empty value. Am I wrong somewhere?

Upvotes: 4

Views: 364

Answers (4)

Bart Kiers

Reputation: 170298

Ted wrote:

For this, it should validate 1 or empty value. Am I correct?

Yes, that is correct.

Ted wrote:

When I attempt to execute it in Python, it returns true only for 1111 but not 11 or empty value. Am I wrong somewhere?

The empty string does get matched. The following snippet:

#!/usr/bin/env python3
import re

for n in range(0, 51):
  ones = '1' * n
  matches = re.match(r'^1?$|^(11+?)\1+$', ones)
  if matches:
    # group(1) is None when the first alternative (^1?$) matched
    div1 = n if matches.group(1) is None else len(matches.group(1))
    # integer division; avoid dividing by zero for the empty string
    div2 = 0 if div1 == 0 else len(ones) // div1
    print("[{0:2}]:{1:2} * {2:2} = '{3}'".format(n, div1, div2, ones))

will print:

[ 0]: 0 *  0 = ''
[ 1]: 1 *  1 = '1'
[ 4]: 2 *  2 = '1111'
[ 6]: 2 *  3 = '111111'
[ 8]: 2 *  4 = '11111111'
[ 9]: 3 *  3 = '111111111'
[10]: 2 *  5 = '1111111111'
[12]: 2 *  6 = '111111111111'
[14]: 2 *  7 = '11111111111111'
[15]: 3 *  5 = '111111111111111'
[16]: 2 *  8 = '1111111111111111'
[18]: 2 *  9 = '111111111111111111'
[20]: 2 * 10 = '11111111111111111111'
[21]: 3 *  7 = '111111111111111111111'
[22]: 2 * 11 = '1111111111111111111111'
[24]: 2 * 12 = '111111111111111111111111'
[25]: 5 *  5 = '1111111111111111111111111'
[26]: 2 * 13 = '11111111111111111111111111'
[27]: 3 *  9 = '111111111111111111111111111'
[28]: 2 * 14 = '1111111111111111111111111111'
[30]: 2 * 15 = '111111111111111111111111111111'
[32]: 2 * 16 = '11111111111111111111111111111111'
[33]: 3 * 11 = '111111111111111111111111111111111'
[34]: 2 * 17 = '1111111111111111111111111111111111'
[35]: 5 *  7 = '11111111111111111111111111111111111'
[36]: 2 * 18 = '111111111111111111111111111111111111'
[38]: 2 * 19 = '11111111111111111111111111111111111111'
[39]: 3 * 13 = '111111111111111111111111111111111111111'
[40]: 2 * 20 = '1111111111111111111111111111111111111111'
[42]: 2 * 21 = '111111111111111111111111111111111111111111'
[44]: 2 * 22 = '11111111111111111111111111111111111111111111'
[45]: 3 * 15 = '111111111111111111111111111111111111111111111'
[46]: 2 * 23 = '1111111111111111111111111111111111111111111111'
[48]: 2 * 24 = '111111111111111111111111111111111111111111111111'
[49]: 7 *  7 = '1111111111111111111111111111111111111111111111111'
[50]: 2 * 25 = '11111111111111111111111111111111111111111111111111'

The input 11 is not matched because group 1, (11+?), consumes the whole string 11, and \1+ then requires that same captured text to appear at least once more. Nothing is left to repeat, so the match fails.
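To make this concrete, here is a minimal sketch (Python 3 assumed; the helper name describe is mine, not from the question) that reports which alternative, if any, matches a few short inputs:

```python
import re

PATTERN = re.compile(r'^1?$|^(11+?)\1+$')

def describe(s):
    """Return a short description of how (or whether) PATTERN matches s."""
    m = PATTERN.match(s)
    if m is None:
        return 'no match'
    if m.group(1) is None:
        # Group 1 did not participate, so the first alternative matched.
        return 'matched by ^1?$'
    return 'matched by ^(11+?)\\1+$ with group 1 = ' + repr(m.group(1))

for s in ['', '1', '11', '111', '1111', '111111111']:
    print(repr(s), '->', describe(s))
```

The inputs 11 and 111 should come back as 'no match', since neither length can be split into two or more equal chunks of at least two 1s.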

Upvotes: 2

dawg

Reputation: 104092

You have a + after the \1, which means a greedy one or more.

Are you trying to match 1 between 1 and 4 times?

Use:

r'^1{1,4}$'

The easiest is to use one of the great regex tools out there. Here is my favorite. With the same site, you can see why your regex does not work.

Here is a site that explains regexes.
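Whether a pattern really caps the count is also easy to check directly in Python. A small sketch, assuming Python 3, that compares a plain counted quantifier, 1{1,4}, with a counted group, (1+){1,4}. The inner + lets each repetition of the group absorb any number of 1s, so only the first form limits the length to four:

```python
import re

counted_char = re.compile(r'^1{1,4}$')      # one to four 1s, nothing else
counted_group = re.compile(r'^(1+){1,4}$')  # one to four runs of "one or more 1s"

for n in range(0, 7):
    s = '1' * n
    # Columns: length, matches 1{1,4}?, matches (1+){1,4}?
    print(n, bool(counted_char.match(s)), bool(counted_group.match(s)))
```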

Upvotes: 0

Billy Moon

Reputation: 58619

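I think you need more parentheses to define what the | refers to. I would write the regex like this:

/^(1?|(11+?)\2+)$/

Note that only one start anchor and one end anchor are used, and that the backreference becomes \2 because the outer parentheses are now group 1.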
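For what it's worth, | has the lowest precedence in a regex, so ^1?$|^(11+?)\1+$ already reads as two complete, separately anchored alternatives. A rough check, assuming Python 3 (which takes the pattern as a plain string, without the /.../ delimiters), that the grouped form, written here with a single ^, accepts the same inputs as the original:

```python
import re

original = re.compile(r'^1?$|^(11+?)\1+$')
grouped = re.compile(r'^(1?|(11+?)\2+)$')  # the repeated chunk is now group 2

for n in range(0, 31):
    s = '1' * n
    # Both patterns should agree on every input length tried here.
    assert bool(original.match(s)) == bool(grouped.match(s)), n

print('both patterns agree for lengths 0 through 30')
```

So the extra parentheses are a matter of style rather than a fix; 11 is rejected by both forms.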

Upvotes: 0

John Gaines Jr.

Reputation: 11554

If you want the second expression to match '11', '1111', '111111', etc., use:

^(1+)\1$
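A quick way to see what this accepts, assuming Python 3: (1+)\1 needs the captured run to appear exactly twice, so it should match every non-empty, even-length string of 1s and nothing else.

```python
import re

doubled = re.compile(r'^(1+)\1$')

# Collect the lengths (0 through 12) for which a string of 1s matches.
matching_lengths = [n for n in range(0, 13) if doubled.match('1' * n)]
print(matching_lengths)  # [2, 4, 6, 8, 10, 12]
```

Note that this is a different set from the original ^(11+?)\1+$, which rejects 11 but accepts odd lengths such as 9.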

Upvotes: 0
