Squall Leohart

Reputation: 677

Python Regex: why does this not work?

Why does this not work?

re.sub('\\b[a@](\\W|[a@])*[s5$](\\W|[s5$])*[s5$](\\W|[s5$])*($|\\W)', '*', '@ss')

I do not see why @ss is not replaced by *. Similarly, @55 is not replaced.

These are replaced: a55, a5s, as5, ass

Thank you!

Upvotes: 2

Views: 119

Answers (3)

Jon Clements

Reputation: 142136

If you're trying to do a sort of "profanity" check, I would take the logic out of the regex.

look_alike = {'@': 'A', '$': 'S', '5': 'S'}  # map look-alike characters to the letters they imitate
your_string = '@ss'  # example input
test_string = ''.join(look_alike.get(c, c) for c in your_string.upper())  # also look at `str.translate`

Then check if 'ASS' in test_string, or do something similar with word boundaries using a regex.
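
A minimal sketch of this approach, assuming Python 3 and a hypothetical look-alike table that also maps 5 to S (as the pattern in the question does). The names LOOK_ALIKE, normalize and contains_word are illustrative only and not part of any library:

```python
import re

# Hypothetical look-alike table (an assumption); extend it as needed.
LOOK_ALIKE = str.maketrans({'@': 'A', '$': 'S', '5': 'S'})

def normalize(text):
    """Upper-case the text and map look-alike characters to letters."""
    return text.upper().translate(LOOK_ALIKE)

def contains_word(text, word='ASS'):
    """Return True if `word` appears as a whole word after normalization."""
    pattern = r'\b' + re.escape(word) + r'\b'
    return re.search(pattern, normalize(text)) is not None

print(normalize('@55'))        # ASS
print(contains_word('@ss'))    # True
print(contains_word('class'))  # False
```

Because the look-alikes are turned into letters before matching, \b behaves as expected at the start of the word. Unlike the regex in the question, this sketch does not skip separators inside the word (for example a.s.s), so that would need extra handling.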

Upvotes: 0

Kendall Frey

Reputation: 44326

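It's because @ is not a word character, so the first \b does not match: \b needs a word character on exactly one side, and at the start of the string there is none before the @.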

This is my suggestion:

re.sub('(\\ba|@)(\\W|[a@])*[s5$](\\W|[s5$])*[s5$](\\W|[s5$])*($|\\W)', '*', '@ss')

(Replacing \b[a@] with (\ba|@))
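
A short check of this behavior, assuming Python 3 and raw strings in place of the doubled backslashes. The variable names original and suggested are illustrative only:

```python
import re

# Pattern from the question and the suggested replacement, as raw strings.
original = r'\b[a@](\W|[a@])*[s5$](\W|[s5$])*[s5$](\W|[s5$])*($|\W)'
suggested = r'(\ba|@)(\W|[a@])*[s5$](\W|[s5$])*[s5$](\W|[s5$])*($|\W)'

# \b cannot match before a leading @, but it can between a letter and @.
print(re.search(r'\b@', '@ss'))               # None
print(re.search(r'\b@', 'x@ss') is not None)  # True

for text in ['ass', 'a55', '@ss', '@55']:
    print(text, re.sub(original, '*', text), re.sub(suggested, '*', text))
# ass * *
# a55 * *
# @ss @ss *
# @55 @55 *
```

Note that with (\ba|@) the @ alternative carries no boundary check at all, so it also matches an @ in the middle of a word.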

Upvotes: 2

RocketDonkey

Reputation: 37259

Try wrapping the first section in parentheses and making it optional with *, so the match no longer depends on the \b succeeding:

re.sub('(\\b[a@])*(\\W|[a@])*[s5$](\\W|[s5$])*[s5$](\\W|[s5$])*($|\\W)', '*', '@ss')
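
A quick check of what this pattern does, assuming Python 3 and a raw string in place of the doubled backslashes. Since the first group may repeat zero times, the leading @ is picked up by the following (\W|[a@])* instead, and the boundary is no longer enforced:

```python
import re

# Same pattern as above, written as a raw string.
pattern = r'(\b[a@])*(\W|[a@])*[s5$](\W|[s5$])*[s5$](\W|[s5$])*($|\W)'

print(re.sub(pattern, '*', '@ss'))    # *
print(re.sub(pattern, '*', 'ss'))     # *
print(re.sub(pattern, '*', 'class'))  # cl*
```

The last two lines show the trade-off: without a required boundary, the pattern also fires on a bare ss and inside longer words.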

Upvotes: 0
