giodamelio

Reputation: 5605

Removing Punctuation From Python List Items

I have a list like

['hello', '...', 'h3.a', 'ds4,']

This should turn into:

['hello', 'h3a', 'ds4']

I want to remove only the punctuation, leaving the letters and numbers intact. Punctuation is anything in the string.punctuation constant. I know this is going to be simple, but I'm fairly new to Python, so...

Thanks, giodamelio

Upvotes: 17

Views: 80614

Answers (6)

Guy Lifshitz

Reputation: 11

Just be aware that string.punctuation contains only ASCII punctuation, so it works for English but may miss punctuation marks used in other languages.

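You could collect them in a list LIST_OF_LANGUAGE_SPECIFIC_PUNCTUATION, join that list into a string, and concatenate it to string.punctuation to get a fuller set of punctuation. (string.punctuation is a str, so adding a list to it directly raises a TypeError.)

punctuation = string.punctuation + ''.join(LIST_OF_LANGUAGE_SPECIFIC_PUNCTUATION)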
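As a rough sketch of how that extended set could be used: the marks in EXTRA_PUNCTUATION and the helper name strip_punctuation below are placeholders chosen for illustration, not something from the question, so substitute whatever your language needs.

```python
import string

# Hypothetical extra marks, for illustration only.
EXTRA_PUNCTUATION = ['¿', '¡', '«', '»', '…']

# string.punctuation is a str, so join the list before concatenating.
punctuation = string.punctuation + ''.join(EXTRA_PUNCTUATION)


def strip_punctuation(s, chars=punctuation):
    """Return s without any character that appears in chars."""
    return ''.join(c for c in s if c not in chars)
```

Calling strip_punctuation on each list item then removes both the ASCII marks and the extra ones.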

Upvotes: 1

florex

Reputation: 963

In Python 3, use this instead:

import string
s = s.translate(str.maketrans('','',string.punctuation))
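Applied to the list from the question, that might look like the sketch below. The variable names items, stripped, and cleaned are only illustrative. The last step drops strings that end up empty, which the question's expected output seems to want.

```python
import string

# Build the table once; the third argument lists the characters to delete.
table = str.maketrans('', '', string.punctuation)

items = ['hello', '...', 'h3.a', 'ds4,']
stripped = [s.translate(table) for s in items]
cleaned = [s for s in stripped if s]  # drop strings that became empty
print(cleaned)  # ['hello', 'h3a', 'ds4']
```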

Upvotes: 3

Josh Bleecher Snyder

Reputation: 8432

Use str.translate (this two-argument form works only on Python 2 str objects):

>>> import string
>>> test_case = ['hello', '...', 'h3.a', 'ds4,']
>>> [s.translate(None, string.punctuation) for s in test_case]
['hello', '', 'h3a', 'ds4']

For the documentation of translate, see http://docs.python.org/library/string.html

Upvotes: 9

Mark Byers

Reputation: 838156

Assuming that your initial list is stored in a variable x, you can use this:

>>> import string
>>> x = [''.join(c for c in s if c not in string.punctuation) for s in x]
>>> print(x)
['hello', '', 'h3a', 'ds4']

To remove the empty strings:

>>> x = [s for s in x if s]
>>> print(x)
['hello', 'h3a', 'ds4']

Upvotes: 28

Ant

Reputation: 5414

import string

print(''.join(x for x in st if x not in string.punctuation))

P.S. st is the string. For a list it works the same way:

[''.join(x for x in par if x not in string.punctuation) for par in alist]

I think this works well. Take a look at string.punctuation:

>>> print(string.punctuation)
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

Upvotes: 2

Rafe Kettler

Reputation: 76955

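To make a new list (note that this pattern keeps only ASCII letters and digits, so it also strips whitespace and accented letters):

import re
[re.sub(r'[^A-Za-z0-9]+', '', x) for x in list_of_strings]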
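If the goal is to remove exactly the characters in string.punctuation and nothing else, one possible variant is to build the character class from that constant. This is a sketch, not the answer's original approach, and the function name remove_punctuation is made up for the example.

```python
import re
import string

# Character class matching exactly the characters in string.punctuation;
# re.escape protects ], \, ^ and - inside the brackets.
PUNCT_RE = re.compile('[' + re.escape(string.punctuation) + ']')


def remove_punctuation(strings):
    """Strip punctuation from each string and drop the ones left empty."""
    stripped = (PUNCT_RE.sub('', s) for s in strings)
    return [s for s in stripped if s]
```

Unlike the alphanumeric-only pattern, this leaves spaces and non-ASCII letters untouched.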

Upvotes: 1
