Ramon Hallan

Reputation: 125

Removing white spaces and punctuation from list

def wordlist (l: list) -> list:
    '''Returns a wordlist without white spaces and punctuation'''
    result = []
    table = str.maketrans('!()-[]:;"?.,', '            ')
    for x in l:
        n = x.translate(table)
        n = x.strip()
        n = x.split()
        if n != []:
            result.extend(n)
    return result

The function is supposed to work like this:

print(wordlist(['  Testing', '????', 'function!!']))

should yield:

['Testing', 'function']

but the code I have above yields:

['Testing', '????', 'function!!']

So I assume I'm doing something wrong in the part of the code that removes punctuation. Where should I fix it? Any other suggestions to simplify the code would also be appreciated, since it seems a bit long-winded.

Upvotes: 0

Views: 903

Answers (2)

vaultah

Reputation: 46533

Did you mean to chain translate(table), strip() and split() calls?

Then

n = x.translate(table)
n = x.strip()
n = x.split()

should be

n = x.translate(table)
n = n.strip() # change x to n
n = n.split() # same here

or

n = x.translate(table).split()

There is no need for the intermediate strip(): split() with no arguments already ignores leading and trailing whitespace.
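A minimal sketch to check both points: string methods return new strings rather than modifying x, and split() with no arguments gives the same result with or without a preceding strip(). The sample string is made up for illustration.

```python
# Same translation as in the question: each punctuation mark maps to a space.
table = str.maketrans('!()-[]:;"?.,', ' ' * 12)

x = '  function!!  '      # hypothetical sample input
n = x.translate(table)    # returns a new string; x itself is unchanged

print(repr(x))            # x still contains the punctuation
print(n.strip().split())  # with the intermediate strip()
print(n.split())          # without it: same list
```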

As for further simplification, you don't have to check whether n is empty. Extending a list with an empty list does nothing, so the check looks like premature optimization to me:

if n != []: # you can remove this line
    result.extend(n)

The result:

def wordlist (l: list) -> list:
    '''Returns a wordlist without white spaces and punctuation'''
    result = []
    table = str.maketrans('!()-[]:;"?.,', '            ')
    for x in l:
        result.extend(x.translate(table).split())
    return result

You can even replace that loop with a list comprehension.
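For instance, one possible way to write it as a list comprehension (a sketch, not the only option) uses two for clauses, one over the input strings and one over the words in each cleaned string:

```python
def wordlist(l: list) -> list:
    '''Returns a wordlist without white spaces and punctuation'''
    table = str.maketrans('!()-[]:;"?.,', ' ' * 12)
    # The first "for" walks the input strings; the second walks the words
    # left in each string once punctuation has been replaced by spaces.
    return [word for x in l for word in x.translate(table).split()]

print(wordlist(['  Testing', '????', 'function!!']))
# ['Testing', 'function']
```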

Upvotes: 1

tzaman

Reputation: 47780

Probably a lot cleaner to just use re.sub here:

import re
clean = re.compile(r'[!()\-\[\]:;"?.,\s]')

words = ['  Testing', '????', 'function!!']
result = list(filter(bool, (clean.sub('', w) for w in words)))
print(result)
# ['Testing', 'function']
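One caveat: re.sub also deletes whitespace inside a string, so an element with two words is glued into one. If the words should stay separate, a possible variation (an assumption on my part, not something the question asks for) is to match runs of the characters that are not in the class with findall. The input below is made up to include an inner space.

```python
import re

# Same character class as above.
clean = re.compile(r'[!()\-\[\]:;"?.,\s]')
# Hypothetical alternative: runs of characters NOT in that class.
word = re.compile(r'[^!()\-\[\]:;"?.,\s]+')

words = ['  Testing', '????', 'two words!']  # made-up input

joined = list(filter(bool, (clean.sub('', w) for w in words)))
separate = [m for w in words for m in word.findall(w)]

print(joined)    # ['Testing', 'twowords']
print(separate)  # ['Testing', 'two', 'words']
```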

Upvotes: 0
