How to replace all numbers except specific numbers in Python?

Question

I have a list of strings, some of which contain numbers.

For instance:

abc123 def3464
hello32 goodbye64
some numbers 1254324

I want to remove all numbers from those strings but keep specific numbers, such as 32 and 64, so the cleanup returns this:

abc def
hello32 goodbye64
some numbers

Note that in the first example (def3464) the number 64 is present, but only as part of a larger number, so it should be removed.

Any ideas?

Valdi_Bo · Accepted Answer

You can do this without a lambda, relying solely on regex features (although the regex becomes more complicated).

The regex needed is: (?:(32|64)|\d+)(?=\D|$). Details:

(?: - Start of the non-capturing group, needed as a container for alternatives.
(32|64) - The first alternative (and capturing group), either 32 or 64.
| - Or.
\d+ - The second alternative, a sequence of digits.
) - End of the non-capturing group.
(?=\D|$) - The (common) ending part (after both alternatives) - positive lookahead for either a non-digit char or the end of the string.
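The role of the lookahead is easiest to see on a number that merely starts with 32 or 64. The following is a small sketch, not part of the original answer; the variable names and the sample string are made up for illustration. It compares the regex with and without the final lookahead:

```python
import re

# The regex from the answer, and the same regex with the lookahead removed.
with_lookahead = r"(?:(32|64)|\d+)(?=\D|$)"
without_lookahead = r"(?:(32|64)|\d+)"

# Hypothetical sample: "6412" starts with 64 but is a different number.
sample = "abc6412 x64"

# With the lookahead, "64" cannot match inside "6412" (a digit follows),
# so the engine falls back to \d+ and removes the whole number.
print(re.sub(with_lookahead, r"\1", sample))     # abc x64

# Without it, the leading "64" is kept and only "12" is removed.
print(re.sub(without_lookahead, r"\1", sample))  # abc64 x64
```
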

The first alternative (and capturing group) matches either 32 or 64 and the second alternative (without capturing group) matches any other number.

The replacement expression is \1 (replace the match with the content of the first capturing group).

So, if the second alternative matched, the first group did not take part in the match, and Python (3.5 and later) substitutes an empty string for \1. The matched number is therefore removed.
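To check which alternative matched each number, one option is to list every match together with the content of group 1. This is only an illustrative sketch; the helper name show_matches is hypothetical and does not appear in the answer:

```python
import re

PATTERN = re.compile(r"(?:(32|64)|\d+)(?=\D|$)")

def show_matches(text):
    """Return (matched_text, group_1) pairs for every number found in text."""
    # group(1) is None whenever the \d+ alternative produced the match.
    return [(m.group(0), m.group(1)) for m in PATTERN.finditer(text)]

print(show_matches("abc123 def3464"))     # [('123', None), ('3464', None)]
print(show_matches("hello32 goodbye64"))  # [('32', '32'), ('64', '64')]
```

Matches whose group 1 is None are the ones that re.sub replaces with nothing.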

To demonstrate how it works, run the example program:

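import re

src = ['abc123 def3464', 'hello32 goodbye64', 'some numbers 1254324']
print(src)
# Keep a number only when it is exactly 32 or 64 (captured in group 1);
# any other digit run is replaced with the empty group.
result = [re.sub(r"(?:(32|64)|\d+)(?=\D|$)", r"\1", s) for s in src]
print(result)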

If you are unhappy with the trailing space in the last output string, add .strip() after re.sub(...).
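If the set of numbers to keep varies, the same idea can be wrapped in a function. The sketch below is an assumption-laden extension rather than part of the answer: the function name strip_numbers and its keep parameter are hypothetical, and it assumes Python 3.6+ for the f-string.

```python
import re

def strip_numbers(text, keep=("32", "64")):
    """Remove every number from text except standalone numbers listed in keep."""
    # Build the capturing alternation, e.g. "32|64", escaping each entry.
    alternatives = "|".join(re.escape(k) for k in keep)
    pattern = rf"(?:({alternatives})|\d+)(?=\D|$)"
    # .strip() drops the space left behind when a number ended the string.
    return re.sub(pattern, r"\1", text).strip()

src = ["abc123 def3464", "hello32 goodbye64", "some numbers 1254324"]
print([strip_numbers(s) for s in src])
# ['abc def', 'hello32 goodbye64', 'some numbers']
```
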
