How to find all strings of length 5 with 1 digit and 4 letters, divided into all group combinations

Question

I need a regex to count the strings of length 5 that contain 1 digit (0-9) and 4 lowercase letters (a-z), split into the following groups:

1 digit and all letters are different
For example: 1abcd
1 digit, 2 letters are equal and the rest are different
For example: a2acd
1 digit, 3 letters are equal and the rest are different
For example: aa3ad
1 digit, 4 letters are equal
For example: aa5aa
1 digit, 2 letters are equal and two different other letters are equal
For example: 1aabb

I know how to match all the strings with length of 5 with letters and 1 digit:
^(?=.{5}$)[a-z]*(?:\d[a-z]*){1}$
Here is an example.

But I don't know how to do it for each of the above groups.
I read that for the first group (1 digit and all letters are different) I need to rule out a repeated character with .*(.).*\1, so I tried:

^(?=.{5}$)[a-z]*(?:\d[a-z]*)(.*(.).*\1){1}$

It didn't work.

dawg · Accepted Answer

You can use:

/\b(?=[a-zA-Z]*\d[a-zA-Z]*)([a-zA-Z0-9]{5})/

Demo

Add a second \b so that strings longer than 5 characters are rejected:

/\b(?=[a-zA-Z]*\d[a-zA-Z]*)([a-zA-Z0-9]{5}\b)/

Demo 2

If you then want to limit to lower case letters:

/\b(?=[a-z]*\d[a-z]*)([a-z0-9]{5}\b)/

Since every combination of the four letters is allowed (all the same, all different, some the same), the regex itself does not need to classify them.

If you DO want to classify the letters, just capture in Python and add the logic desired.
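For the first group only (all letters different), a regex-only check is also possible. The following is a minimal sketch rather than part of the demos: it puts the .*(.).*\1 idea from the question inside a negative lookahead, so a repeated character rejects the string instead of being required. The names ALL_DIFFERENT and is_all_different are made up for illustration.

```python
import re

# Hypothetical names; the pattern combines three checks:
#   (?!.*(.).*\1)        no character occurs twice
#   (?=[a-z]*\d[a-z]*$)  exactly one digit, everything else a-z
#   [a-z0-9]{5}          total length 5
ALL_DIFFERENT = re.compile(r'(?!.*(.).*\1)(?=[a-z]*\d[a-z]*$)[a-z0-9]{5}')

def is_all_different(s):
    """True when s is one digit plus four distinct lowercase letters."""
    return ALL_DIFFERENT.fullmatch(s) is not None

for s in ['1abcd', 'a2acd', 'ab12c', 'abcde']:
    print(s, is_all_different(s))
```

The other groups are awkward to express this way, which is why doing the classification in Python tends to be simpler.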

Based on your example (it would help if the question stated explicitly what should and should not match):

/(?=^[a-z]*\d[a-z]*$)(^[a-z0-9]{5}$)/mg

Demo 3
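One difference between the \b version and the anchored version is worth checking. In the \b version the lookahead is not tied to the end of the string, so it only requires at least one digit, while the anchored lookahead requires exactly one. A small sketch of that comparison follows; the patterns are copied from above and the variable names are invented here.

```python
import re

# Patterns copied from the answer; only the names are made up for this sketch.
word_boundary = re.compile(r'\b(?=[a-z]*\d[a-z]*)([a-z0-9]{5}\b)')
anchored = re.compile(r'(?=^[a-z]*\d[a-z]*$)(^[a-z0-9]{5}$)', re.M)

samples = ['jwzw3', 'k88q5', 'bjkgp']

# Keep the samples each pattern accepts.
wb_hits = [s for s in samples if word_boundary.search(s)]
anchored_hits = [s for s in samples if anchored.search(s)]

print(wb_hits)        # ['jwzw3', 'k88q5']
print(anchored_hits)  # ['jwzw3']
```

So if strings with two or more digits must be excluded, the anchored form appears to be the safer choice.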

Then if you want to classify into groups, I would just do that in Python:

import re

st='''\
1aaaa
2aabb
jwzw3
jlwk6
bjkgp
5fm8s
x975t
k88q5
zl796
qm9hb
h6gtf
9rm9p'''

# Key: number of distinct letters (distinct characters minus the one digit)
di={}
for m in re.finditer(r'(?=^[a-z]*\d[a-z]*$)(^[a-z0-9]{5}$)', st, re.M):
    di.setdefault(len(set(m.group(1)))-1, []).append(m.group(1))

>>> di
{1: ['1aaaa'], 2: ['2aabb'], 3: ['jwzw3'], 4: ['jlwk6', 'qm9hb', 'h6gtf']}
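Note that keying on the number of distinct letters puts "3 equal + 1 different" (e.g. aa3ad) and "two pairs" (e.g. 1aabb) under the same key, 2. To separate all five groups from the question, one option is to key on the sorted letter counts instead. This is a sketch under that assumption; the group labels and the classify helper are invented for illustration.

```python
import re
from collections import Counter

# Labels are invented for this sketch; keys are the sorted letter counts.
GROUPS = {
    (1, 1, 1, 1): 'all different',
    (1, 1, 2): 'one pair',
    (1, 3): 'three equal',
    (4,): 'four equal',
    (2, 2): 'two pairs',
}

# Same anchored pattern as above: length 5, exactly one digit, rest a-z.
PATTERN = re.compile(r'(?=^[a-z]*\d[a-z]*$)(^[a-z0-9]{5}$)', re.M)

def classify(text):
    """Map each group label to the matching lines of text."""
    result = {}
    for m in PATTERN.finditer(text):
        word = m.group(1)
        letters = [c for c in word if c.isalpha()]
        # e.g. 'aa3ad' -> counts {a: 3, d: 1} -> shape (1, 3)
        shape = tuple(sorted(Counter(letters).values()))
        result.setdefault(GROUPS[shape], []).append(word)
    return result

print(classify('1abcd\na2acd\naa3ad\naa5aa\n1aabb\nk88q5'))
```

Every match has exactly four letters, so the five keys in GROUPS cover every shape that can occur; counting the groups is then len() of each list.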
