littletennis

Reputation: 71

Regex: match whole words containing an even number of a specific letter

I would like to find the words in a string that contain an even, non-zero number of the letter A. For example, given the string "aa abs aadfc asdacx adds asdwe", the output of re.findall should be ['aa', 'aadfc', 'asdacx'].

I wrote the regex this way: pattern = r'\b[^A]*[(A[^A]*A[^A]*)+]\b'. However, the output is very strange. For example, re.findall(pattern, 'eeee') only returns 'ee'. I guess the problem comes from the parentheses. Can anyone help me out?

Upvotes: 1

Views: 138

Answers (1)

anubhava

Reputation: 785721

You may use this regex for this job:

\b(?:(?:[^a\W]*a){2})+[^a\W]*\b

RegEx Breakup:

  • \b: Word boundary
  • (?:: Start non-capture group 1
    • (?:: Start non-capture group 2
      • [^a\W]*: Match 0 or more word characters other than a (the class excludes a and, via \W, every non-word character)
      • a: Match a
    • ){2}: End non-capture group 2. Repeat this group exactly 2 times
  • )+: End non-capture group 1. Repeat this group 1+ times
  • [^a\W]*: Match 0 or more word characters other than a
  • \b: Word boundary
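
The breakup above can also be written directly into the pattern with re.VERBOSE, which lets each part carry its own comment. This is only a sketch of the same regex; the name EVEN_A is made up for illustration.

```python
import re

# Same pattern as in the answer, spread out with re.VERBOSE.
# EVEN_A is a hypothetical name chosen for this sketch.
EVEN_A = re.compile(r"""
    \b                # word boundary
    (?:               # group 1: one pair of a's
        (?:           # group 2: a single a plus what precedes it
            [^a\W]*   # 0 or more word characters other than a
            a         # the letter a itself
        ){2}          # group 2 exactly twice
    )+                # group 1 one or more times: 2, 4, 6, ... a's
    [^a\W]*           # trailing word characters other than a
    \b                # word boundary
""", re.VERBOSE)

print(EVEN_A.findall("aa abs aadfc asdacx adds asdwe"))
# ['aa', 'aadfc', 'asdacx']
```

Because [^a\W] still matches digits and underscores, a token such as a1a would also count as a word with two a's.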

Code:

import re

s = "aa abs aadfc asdacx adds asdwe bcadcapca"

# "bcadcapca" contains three a's (an odd count), so it is not matched
print(re.findall(r'\b(?:(?:[^a\W]*a){2})+[^a\W]*\b', s))

Output:

['aa', 'aadfc', 'asdacx']
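
If the letter needs to vary, the same pattern can be built from a parameter. The helper below is a rough sketch, not part of the original answer: the function name even_count_words and its ignore_case argument are assumptions made for illustration, and it only accepts a single alphanumeric character so the letter can be dropped into the character class safely.

```python
import re

def even_count_words(text, letter, ignore_case=False):
    """Return the words in which `letter` occurs 2, 4, 6, ... times.

    Hypothetical helper: builds the answer's pattern for any one letter.
    """
    if len(letter) != 1 or not letter.isalnum():
        raise ValueError("letter must be a single alphanumeric character")
    # Word characters other than the chosen letter.
    other = rf"[^{letter}\W]*"
    # Pairs of the letter, repeated one or more times, within one word.
    pattern = rf"\b(?:(?:{other}{letter}){{2}})+{other}\b"
    flags = re.IGNORECASE if ignore_case else 0
    return re.findall(pattern, text, flags)

s = "aa abs aadfc asdacx adds asdwe bcadcapca"
print(even_count_words(s, "a"))
# ['aa', 'aadfc', 'asdacx']
```

Passing ignore_case=True makes both A and a count toward the total, which may be closer to what the question intends given that it mentions the letter A but uses lowercase examples.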

Upvotes: 2
