innicoder

Reputation: 2688

Pythonic Way of replacing Multiple Characters

I've created a one-off function:

a = lambda x: x.replace('\n', '')
b = lambda y: y.replace('\t', '').strip()
c = lambda x: b(a(x))

Is there a Pythonic and compact way (a one-liner?) to do this that improves readability and performance? Mainly performance.

(Note: I know I can write lambda x: x.replace('\n', '').replace('\t', '').strip(), but that doesn't really change anything. Ideally there's a built-in method for this kind of problem that I'm not aware of. I also know that the performance improvements are negligible.)

Input: 'my \t\t\t test, case \ntest\n LoremIpsum'

Desired Output: 'my test, case test LoremIpsum'

Upvotes: 1

Views: 200

Answers (2)

cs95

Reputation: 402263

Option 1
str.translate
For starters, if you're replacing a lot of characters with the same thing, I'd 100% recommend str.translate.

>>> from string import whitespace as wsp
>>> '\n\ttext   \there\r'.translate(str.maketrans(dict.fromkeys(wsp, '')))
'texthere'

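This syntax is valid with python-3.x only. On python-2.x, str.maketrans does not exist; import string and use string.maketrans to build a table, or pass None as the table and the characters to delete as the second argument, e.g. s.translate(None, string.whitespace).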

If you want to keep the space character itself, remove it from the set of characters to delete first:

wsp = set(wsp) - {' '}
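Applied to the input from the question, this might look like the sketch below. The names strip_table and clean are illustrative only and do not come from the question. Note that the result keeps both spaces that surrounded the tabs, so it has two spaces after 'my'; if runs of spaces should also collapse to one, ' '.join(text.split()) is one way to get that.

```python
from string import whitespace

# Deletion table for every whitespace character except the plain space.
# Mapping a character to None tells str.translate to drop it.
# (strip_table and clean are hypothetical names used for illustration.)
strip_table = str.maketrans(dict.fromkeys(set(whitespace) - {' '}, None))


def clean(text):
    """Delete tabs, newlines and similar characters, keep spaces, trim the ends."""
    return text.translate(strip_table).strip()


result = clean('my \t\t\t test, case \ntest\n LoremIpsum')
print(repr(result))
```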

Option 2
re.sub
The regex equivalent of the above would be using re.sub.

>>> import re
>>> re.sub(r'\s+', '', '\n\ttext   \there\r')
'texthere'

However, performance-wise, str.translate beats this hands down.
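If you want to check the difference on your own data, a rough timeit sketch could look like this. The sample string, the repeat count, and the function names are arbitrary assumptions, and the numbers will vary by machine and Python version.

```python
import re
import timeit
from string import whitespace

# Hypothetical test input: the example string repeated to make timing measurable.
sample = '\n\ttext   \there\r' * 1000

# Build the table and compile the pattern once, outside the timed code.
table = str.maketrans(dict.fromkeys(whitespace, None))
pattern = re.compile(r'\s+')


def with_translate(s):
    return s.translate(table)


def with_regex(s):
    # For str patterns, \s also matches Unicode whitespace beyond
    # string.whitespace, so the two functions only agree on ASCII input like this.
    return pattern.sub('', s)


if __name__ == '__main__':
    for fn in (with_translate, with_regex):
        seconds = timeit.timeit(lambda: fn(sample), number=100)
        print(f'{fn.__name__}: {seconds:.4f}s')
```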

Upvotes: 2

Konstantin Sekeresh

Reputation: 138

The improvements are pretty straightforward:

Drop the lambdas. str.replace() is already a callable method, and the first line of your snippet defines a function that does nothing but call another function. Why wrap it in a lambda? The same applies to the second line.

Use the return values. The docs for str.replace() say:

Return a copy of the string with all occurrences of substring old replaced by new.

So you can do a first replace(), then do a second one on the obtained result.

To sum up, you'll have:

c = x.replace('\n', '').replace('\t', '').strip()

Note: if you have many characters to remove, you'd better use str.translate() but for two of them str.replace() is far more readable.
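If you still want a reusable function, a plain def is one way to write it. This is a minimal sketch; the name clean is illustrative and not from the question.

```python
def clean(text):
    # Each replace() returns a new string, so the calls can be chained.
    return text.replace('\n', '').replace('\t', '').strip()


result = clean('my \t\t\t test, case \ntest\n LoremIpsum')
print(repr(result))
```

Only the tabs and newlines are removed, so the spaces on either side of the tabs both remain in the result.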

Cheers!

Upvotes: 1
