Sloth87

Reputation: 785

How to use the isalpha function to remove special characters

I am trying to remove special characters from each word in a string. The code below counts the words, but I can't get .isalpha() to remove the non-alphabetic characters. Is anyone able to assist? Thank you in advance.

input = 'Hello, Goodbye hello hello! bye byebye hello?'
word_list = input.split()

for word in word_list:
    if word.isalpha()==False:
        word[:-1]
di = dict()
for word in word_list:
    di[word] = di.get(word,0)+1

di

Upvotes: 2

Views: 3729

Answers (3)

Osman Mamun

Reputation: 2882

One solution using re:

In [1]: import re
In [2]: a = 'Hello, Goodbye hello hello! bye byebye hello?'
In [3]: ' '.join([i for i in re.split(r'[^A-Za-z]', a) if i])
Out[3]: 'Hello Goodbye hello hello bye byebye hello'
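A variation on the same idea, as a rough sketch: re.findall with the pattern [A-Za-z]+ collects runs of ASCII letters directly, so there are no empty strings to filter out. The variable names are illustrative, and the ASCII-only pattern is an assumption carried over from the snippet above.

```python
import re

text = 'Hello, Goodbye hello hello! bye byebye hello?'

# Collect each maximal run of ASCII letters; punctuation and spaces are skipped.
words = re.findall(r'[A-Za-z]+', text)

cleaned = ' '.join(words)
print(cleaned)  # Hello Goodbye hello hello bye byebye hello
```

Keeping the intermediate words list is handy if the next step is counting rather than joining.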

Upvotes: 1

jpp

Reputation: 164773

You are nearly there with your for loop. The main stumbling block seems to be that word[:-1] on its own does nothing: you need to store the result somewhere, for example by appending it to a list.

You also need to specify what happens to strings that don't need modifying. I'm also not sure what purpose the dictionary serves.

So here's your for loop re-written:

mystring = 'Hello, Goodbye hello hello! bye byebye hello?'
word_list = mystring.split()

res = []
for word in word_list:
    if not word.isalpha():
        res.append(word[:-1])
    else:
        res.append(word)

mystring_out = ' '.join(res)  # 'Hello Goodbye hello hello bye byebye hello'

The idiomatic way to write the above is to feed a list comprehension to str.join:

mystring_out = ' '.join([word[:-1] if not word.isalpha() else word
                         for word in mystring.split()])

Note that this assumes word.isalpha() returns False only because of a single unwanted character at the end of a word, and that this is the only kind of special character you want to handle.
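If that assumption does not hold (several trailing characters, or punctuation at the start or in the middle of a word), one possible approach is to test each character with str.isalpha() rather than the whole word. This is only a sketch: strip_non_alpha is a hypothetical helper name, and the sample string is altered slightly from the question to show the extra cases.

```python
def strip_non_alpha(word):
    """Return word with every non-alphabetic character removed."""
    # str.isalpha() on a single character is True only for letters.
    return ''.join(ch for ch in word if ch.isalpha())


# Sample input (an assumption, not the question's exact string).
mystring = 'Hello, Goodbye (hello) hello!! bye byebye hello?'

cleaned_words = [strip_non_alpha(w) for w in mystring.split()]
# Drop any word that consisted of punctuation only and is now empty.
cleaned_words = [w for w in cleaned_words if w]

mystring_out = ' '.join(cleaned_words)
print(mystring_out)  # Hello Goodbye hello hello bye byebye hello
```

Two caveats: str.isalpha() is Unicode-aware, so accented letters are kept, and removing characters inside a word turns something like don't into dont, which may or may not be what you want.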

Upvotes: 1

DavidG

Reputation: 25370

It seems you are expecting word[:-1] to remove the last character of word and have that change reflected in word_list. However, strings are immutable: word[:-1] builds a new string, which is discarded because it is never assigned to anything. In addition, word is just a loop variable bound to each element in turn, so even reassigning it would not change the list itself.
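A small demonstration of that behaviour, as a sketch with a shortened, made-up list: rebinding the loop variable leaves the list alone, whereas assigning back by index changes it.

```python
word_list = ['Hello,', 'Goodbye']

for word in word_list:
    # Slicing returns a new string; this only rebinds the name `word`.
    word = word[:-1]

unchanged = list(word_list)
print(unchanged)  # ['Hello,', 'Goodbye']

# Writing back through the index does modify the list in place.
for i, word in enumerate(word_list):
    if not word.isalpha():
        word_list[i] = word[:-1]

print(word_list)  # ['Hello', 'Goodbye']
```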

A simple fix would be to create a new list and append values to it. Note that your original string is called input, which shadows the built-in input() function; that is not a good idea:

input_string = 'Hello, Goodbye hello hello! bye byebye hello?'
word_list = input_string.split()
new = []
for word in word_list:
    if not word.isalpha():
        new.append(word[:-1])
    else:
        new.append(word)

di = dict()
for word in new:
    di[word] = di.get(word,0)+1

print(di)
# {'Hello': 1, 'Goodbye': 1, 'hello': 3, 'bye': 1, 'byebye': 1}  (insertion order, Python 3.7+)

You could also remove the second for loop and use collections.Counter instead:

from collections import Counter
print(Counter(new))
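For reference, here is a self-contained sketch of what Counter offers beyond the plain dictionary. The list new is written out by hand to match the cleaned words above, and the lowercasing step is an extra assumption about wanting Hello and hello counted together.

```python
from collections import Counter

# The cleaned words, written out so the snippet runs on its own.
new = ['Hello', 'Goodbye', 'hello', 'hello', 'bye', 'byebye', 'hello']

counts = Counter(new)
print(counts['hello'])        # 3
print(counts.most_common(1))  # [('hello', 3)]
print(counts['missing'])      # 0, missing keys do not raise KeyError

# Optional: fold case first so 'Hello' and 'hello' share one count.
lowered = Counter(w.lower() for w in new)
print(lowered['hello'])       # 4
```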

Upvotes: 1
