Reputation: 125
def wordlist (l: list) -> list:
    '''Returns a wordlist without white spaces and punctuation'''
    result = []
    table = str.maketrans('!()-[]:;"?.,', '            ')
    for x in l:
        n = x.translate(table)
        n = x.strip()
        n = x.split()
        if n != []:
            result.extend(n)
    return result
The function is supposed to work like this:
print(wordlist([' Testing', '????', 'function!!']))
should yield:
['Testing', 'function']
but the code I have above yields:
['Testing', '??', 'function!!']
So I assume I'm doing something wrong in the part that removes punctuation. Where should I fix it? Any other suggestions to simplify the code would also be appreciated, since it seems a bit long-winded.
Upvotes: 0
Views: 903
Reputation: 46533
Did you mean to chain the translate(table), strip() and split() calls?
Then
n = x.translate(table)
n = x.strip()
n = x.split()
should be
n = x.translate(table)
n = n.strip() # change x to n
n = n.split() # same here
or
n = x.translate(table).split()
There is no need for the intermediate strip(): split() with no arguments already ignores leading and trailing whitespace.
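A quick way to check that the strip() is redundant (a small sketch; the sample string is made up for illustration): with no arguments, split() splits on runs of whitespace and drops any leading or trailing whitespace.

```python
# Hypothetical input with leading, trailing and repeated inner whitespace
sample = '  Testing   function  '

with_strip = sample.strip().split()
without_strip = sample.split()

print(with_strip == without_strip)  # True
print(without_strip)                # ['Testing', 'function']
```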
As for further simplification, you don't have to check whether n is empty. It looks like a premature optimization to me, since extending a list with an empty list does nothing:
if n != []:  # you can remove this line
    result.extend(n)
The result:
def wordlist (l: list) -> list:
    '''Returns a wordlist without white spaces and punctuation'''
    result = []
    table = str.maketrans('!()-[]:;"?.,', '            ')
    for x in l:
        result.extend(x.translate(table).split())
    return result
You can even replace that loop with a list comprehension.
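One possible shape for that list comprehension, as a sketch rather than the only way to write it. Building the table from a dict via dict.fromkeys is a substitution made here so that the two-argument form's equal-length requirement doesn't matter; it is not part of the code above.

```python
def wordlist(l: list) -> list:
    '''Returns a wordlist without white spaces and punctuation'''
    # Assumption: map every punctuation character to a space using the
    # single-dict form of str.maketrans
    table = str.maketrans(dict.fromkeys('!()-[]:;"?.,', ' '))
    # Nested comprehension: for each string, emit each word left after
    # translating punctuation to spaces and splitting on whitespace
    return [word for x in l for word in x.translate(table).split()]

print(wordlist([' Testing', '????', 'function!!']))
# ['Testing', 'function']
```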
Upvotes: 1
Reputation: 47780
It's probably a lot cleaner to just use re.sub here:
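import re
# Matches any of the punctuation characters or any whitespace character
clean = re.compile(r'[!()\-\[\]:;"?.,\s]')
words = [' Testing', '????', 'function!!']
# Strip the matched characters, then drop strings that end up empty
result = list(filter(bool, (clean.sub('', w) for w in words)))
print(result)
# ['Testing', 'function']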
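One difference worth keeping in mind: because the pattern deletes whitespace rather than splitting on it, an entry that contains several words comes back as one joined string. A small comparison, assuming a made-up multi-word entry and a dict-built translation table (both are illustrative, not from the posts above):

```python
import re

clean = re.compile(r'[!()\-\[\]:;"?.,\s]')
# Assumption: same punctuation set, each character mapped to a space
table = str.maketrans(dict.fromkeys('!()-[]:;"?.,', ' '))

entry = 'two words!'  # hypothetical entry containing a space

joined = clean.sub('', entry)               # whitespace is deleted
separate = entry.translate(table).split()   # whitespace separates words

print(joined)    # twowords
print(separate)  # ['two', 'words']
```

If every entry is a single word, as in the question's example, both approaches give the same result.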
Upvotes: 0