labjunky

Reputation: 831

How can I customize what characters are filtered out using string.punctuation?

I have a string from which I would like to remove all punctuation. I currently use:

import string
translator = str.maketrans('','', string.punctuation)
name = name.translate(translator)

However, when the string is a name, this also removes the hyphen, which I would like to keep. For instance, "\Fred-Daniels!" should become "Fred-Daniels".

How can I modify the above code to achieve this?

Upvotes: 2

Views: 1905

Answers (3)

Sachin Rastogi

Reputation: 477

import string

PUNCT_TO_REMOVE = string.punctuation
print(PUNCT_TO_REMOVE) # Output : !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

# Now suppose you don't want _ in your PUNCT_TO_REMOVE

PUNCT_TO_REMOVE = PUNCT_TO_REMOVE.replace("_","")
print(PUNCT_TO_REMOVE) # Output : !"#$%&'()*+,-./:;<=>?@[\]^`{|}~
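
Applied to the question's case, the same idea might look like the sketch below. It assumes the hyphen is the only character to keep; the helper name strip_punctuation is hypothetical and not part of the original answer.

```python
import string

# Punctuation to strip, with the hyphen taken out so it survives.
PUNCT_TO_REMOVE = string.punctuation.replace("-", "")


def strip_punctuation(text):
    """Hypothetical helper: delete every character listed in PUNCT_TO_REMOVE."""
    return text.translate(str.maketrans("", "", PUNCT_TO_REMOVE))


print(strip_punctuation("\\Fred-Daniels!"))  # Fred-Daniels
```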

Upvotes: 3

Brian

Reputation: 1998

Depending on the use case, it could be safer and clearer to explicitly list the valid characters:

>>> name = '\\test-1.'
>>> valid_characters = 'abcdefghijklmnopqrstuvwxyz1234567890- '
>>> filtered_name = ''.join(x for x in name if x.lower() in valid_characters)
>>> print(filtered_name)
test-1

Note that many people have names that include punctuation though, like "Mary St. Cloud-Stevens", "Jim Chauncey, Jr.", etc.
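
One way to accommodate such names is to widen the allowlist. The sketch below assumes that the period, comma, and apostrophe are acceptable in addition to the hyphen and space; that set and the helper name keep_allowed are illustrative choices, not a rule.

```python
import string

# Allowlist: ASCII letters, digits, space, plus punctuation assumed
# (for illustration only) to be legitimate inside names.
ALLOWED = set(string.ascii_letters + string.digits + " -.,'")


def keep_allowed(name):
    """Hypothetical helper: keep only characters found in ALLOWED."""
    return "".join(ch for ch in name if ch in ALLOWED)


print(keep_allowed("\\Mary St. Cloud-Stevens!"))  # Mary St. Cloud-Stevens
print(keep_allowed("Jim Chauncey, Jr."))          # Jim Chauncey, Jr.
```

Like the original snippet, this drops non-ASCII letters, so it may not suit names with accented characters.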

Upvotes: 1

Chris

Reputation: 22963

If you'd like to keep some of the characters in string.punctuation, you can simply remove them from the set before building the translation table:

>>> from string import punctuation
>>> from re import sub
>>> 
>>> string = "\\Fred-Daniels!"  # escape the backslash explicitly
>>> translator = str.maketrans('', '', sub(r'\-', '', punctuation))  # raw string for the regex
>>> string
'\\Fred-Daniels!'
>>> string = string.translate(translator)
>>> string
'Fred-Daniels'

Note that if you only want to exclude one or two characters, str.replace is enough. Otherwise, it's best to stick with re.sub.
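
The two options can be compared side by side in the following sketch. The characters kept in the second case (hyphen, period, apostrophe) are an assumption made for illustration, and clean is a hypothetical helper.

```python
import re
from string import punctuation

# One character to keep: str.replace is enough.
keep_hyphen = punctuation.replace("-", "")

# Several characters to keep: a regex character class removes them
# from the punctuation set in a single call. The set -.' is an
# assumption chosen for this example.
keep_several = re.sub(r"[-.']", "", punctuation)


def clean(text, to_remove):
    """Hypothetical helper: delete every character listed in to_remove."""
    return text.translate(str.maketrans("", "", to_remove))


print(clean("\\Fred-Daniels!", keep_hyphen))            # Fred-Daniels
print(clean("Mary St. Cloud-Stevens!", keep_several))   # Mary St. Cloud-Stevens
```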

Upvotes: 8
