Reputation: 71
I would like to find the words in a string that contain an even, nonzero number of the letter a. For example, given the string "aa abs aadfc asdacx adds asdwe", the output should be ['aa', 'aadfc', 'asdacx'], using re.findall.
I wrote the regex this way: pattern = r'\b[^A]*[(A[^A]*A[^A]*)+]\b'. However, the output is very strange. For example, re.findall(pattern, 'eeee') only returns 'ee'. I guess the problem comes from the parentheses. Can anyone help me out?
Upvotes: 1
Views: 138
Reputation: 785721
You may use this regex for this job:
\b(?:(?:[^a\W]*a){2})+[^a\W]*\b
RegEx Breakup:
\b : Word boundary
(?: : Start non-capture group 1
(?: : Start non-capture group 2
[^a\W]* : Match 0 or more of any char that is not a and not a non-word character
a : Match a
){2} : End non-capture group 2. Repeat this group exactly 2 times
)+ : End non-capture group 1. Repeat this group 1+ times
[^a\W]* : Match 0 or more of any char that is not a and not a non-word character
\b : Word boundary
Code:
import re
s = "aa abs aadfc asdacx adds asdwe bcadcapca"
# "bcadcapca" contains three a's (an odd count), so it is not matched
print(re.findall(r'\b(?:(?:[^a\W]*a){2})+[^a\W]*\b', s))
Output:
['aa', 'aadfc', 'asdacx']
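As a rough sketch, the same pattern can be written with re.VERBOSE so that each part carries its own comment, and the result can be cross-checked by simply counting letters. The names EVEN_A and even_a_words are made up for this illustration and are not part of the answer above. The counting helper splits on whitespace, so it only agrees with the regex when words are separated by spaces and contain no punctuation.

```python
import re

# Same pattern as in the answer, spelled out with re.VERBOSE.
# EVEN_A is an illustrative name, not from the original answer.
EVEN_A = re.compile(r"""
    \b                # word boundary
    (?:               # group 1: one pair of a's
        (?:           # group 2: a run of non-a word chars, then one a
            [^a\W]*   # 0 or more word characters other than a
            a         # a literal a
        ){2}          # group 2 exactly twice, i.e. two a's
    )+                # one or more pairs, so 2, 4, 6, ... a's
    [^a\W]*           # trailing word characters other than a
    \b                # word boundary
""", re.VERBOSE)


def even_a_words(text):
    """Hypothetical cross-check: whitespace-separated words whose
    count of 'a' is even and greater than zero."""
    result = []
    for word in text.split():
        n = word.count("a")
        if n > 0 and n % 2 == 0:
            result.append(word)
    return result


s = "aa abs aadfc asdacx adds asdwe bcadcapca"
print(EVEN_A.findall(s))   # ['aa', 'aadfc', 'asdacx']
print(even_a_words(s))     # ['aa', 'aadfc', 'asdacx']
```

Both approaches are case-sensitive. If uppercase A should count as well, passing re.IGNORECASE alongside re.VERBOSE (and lowercasing the word before counting in the helper) would be one way to handle it.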
Upvotes: 2