Reputation: 13500
I need a regex to count all the groups of strings of length 5 that contain 1 digit (0-9) and 4 lowercase letters (a-z), one group for each of the following patterns:
1abcd
a2acd
aa3ad
aa5aa
1aabb
I know how to match all the strings with length of 5 with letters and 1 digit:
^(?=.{5}$)[a-z]*(?:\d[a-z]*){1}$
Here is an example.
But I don't know how to do it for each of the above groups.
I read that for the first example (1 digit and all letters different) I need to rule out a repeated character with .*(.).*\1, but when I tried:
^(?=.{5}$)[a-z]*(?:\d[a-z]*)(.*(.).*\1){1}$
It didn't work.
Upvotes: 2
Views: 102
Reputation: 104072
You can use:
/\b(?=[a-zA-Z]*\d[a-zA-Z]*)([a-zA-Z0-9]{5})/
Add a second \b to reject strings longer than 5 characters:
/\b(?=[a-zA-Z]*\d[a-zA-Z]*)([a-zA-Z0-9]{5}\b)/
If you then want to limit to lower case letters:
/\b(?=[a-z]*\d[a-z]*)([a-z0-9]{5}\b)/
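One thing worth checking with these \b versions: the lookahead is not tied to the end of the word, so it only proves that a digit exists, not that there is exactly one. A rough sketch of the difference follows; the `strict` pattern and the word list are my own illustration, not part of the patterns above.

```python
import re

# Pattern as written above: the lookahead can stop before the end of the word.
loose = re.compile(r'\b(?=[a-z]*\d[a-z]*)([a-z0-9]{5}\b)')
# Assumed variant: a \b inside the lookahead forces it to span the whole word,
# so a second digit makes the lookahead fail.
strict = re.compile(r'\b(?=[a-z]*\d[a-z]*\b)([a-z0-9]{5}\b)')

# Hypothetical test words: two valid, two with a second digit,
# one with no digit, one that is too long.
words = ['1abcd', 'aa3ad', 'k88q5', '5fm8s', 'bjkgp', '1abcdef']

loose_hits = [w for w in words if loose.search(w)]
strict_hits = [w for w in words if strict.search(w)]

print(loose_hits)   # ['1abcd', 'aa3ad', 'k88q5', '5fm8s']
print(strict_hits)  # ['1abcd', 'aa3ad']
```

Anchoring the lookahead with ^ and $ in multiline mode has the same effect when each string sits on its own line.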
Since every combination of the four letters is allowed (all the same, all different, or some the same), the regex itself needs no further classification.
If you DO want to classify the letters, just capture in Python and add the logic desired.
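If a pure-regex test for a single group is preferred, the .*(.).*\1 idea from the question can be put inside a negative lookahead, so that a repeated character rejects the string instead of being required. This is only a sketch for the "all letters different" group; the `all_different` name and the sample list are made up for illustration.

```python
import re

# Exactly one digit, four lowercase letters, and no character repeated.
# (?!.*(.).*\1) fails whenever some character occurs twice in the string.
all_different = re.compile(r'^(?=[a-z]*\d[a-z]*$)(?!.*(.).*\1)[a-z0-9]{5}$')

samples = ['1abcd', 'a2acd', 'aa3ad', 'aa5aa', '1aabb', 'jlwk6']
distinct = [s for s in samples if all_different.match(s)]

print(distinct)  # ['1abcd', 'jlwk6']
```

The other groups get awkward quickly in pure regex, which is why doing the classification in Python is usually easier.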
Based on your example (it would help to state what should and should not match for the purposes of this question):
/(?=^[a-z]*\d[a-z]*$)(^[a-z0-9]{5}$)/mg
Then if you want to classify into groups, I would just do that in Python:
import re
st='''\
1aaaa
2aabb
jwzw3
jlwk6
bjkgp
5fm8s
x975t
k88q5
zl796
qm9hb
h6gtf
9rm9p'''
di = {}
# Key each match by its number of distinct letters (distinct characters minus the one digit).
for m in re.finditer(r'(?=^[a-z]*\d[a-z]*$)(^[a-z0-9]{5}$)', st, re.M):
    di.setdefault(len(set(m.group(1))) - 1, []).append(m.group(1))
>>> di
{1: ['1aaaa'], 2: ['2aabb'], 3: ['jwzw3'], 4: ['jlwk6', 'qm9hb', 'h6gtf']}
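Note that counting distinct letters puts aa3ad and 1aabb in the same bucket (two distinct letters each). If those need to be separate groups, as the five examples in the question suggest, one option is to key on the sorted letter multiplicities instead. A sketch, assuming that reading of the question; `letter_shape` is a helper name I made up:

```python
import re
from collections import Counter

def letter_shape(word):
    """Return the letter multiplicities in descending order, e.g. 'aa3ad' -> (3, 1)."""
    counts = Counter(ch for ch in word if ch.isalpha())
    return tuple(sorted(counts.values(), reverse=True))

# Same idea as the anchored pattern above: exactly one digit, five characters per line.
pattern = re.compile(r'^(?=[a-z]*\d[a-z]*$)[a-z0-9]{5}$', re.M)

# The five examples from the question plus two lines that should not match.
text = "1abcd\na2acd\naa3ad\naa5aa\n1aabb\nk88q5\nbjkgp"

groups = {}
for m in pattern.finditer(text):
    groups.setdefault(letter_shape(m.group()), []).append(m.group())

for shape, members in groups.items():
    print(shape, len(members), members)
```

Each of the five example strings then lands in its own group: (1, 1, 1, 1), (2, 1, 1), (3, 1), (4,) and (2, 2), and `len(members)` gives the count per group.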
Upvotes: 2