Reputation: 25
I have to write code that takes an input (text), strips it of all punctuation, and makes it all lowercase. I wrote the code as best I knew, and it doesn't give the outcome I wanted. To start, I made a simple lower function, although it doesn't seem to work. For stripping the punctuation, I made a list of all possible punctuation marks and created a variable that would update to the next mark on each pass, then ran it through a split function. I also use a main function to call all my functions once I've finished. I don't know if this is the cause of my issue, or if this would be easier if I did it in a class. Any input?
import string
punctuations = [".", ",", "?", ";", "!", ":", "'", "(", ")", "[", "]", "\"", "...", "-", "~", "/", "@", "{", "}", "*"]
text= str(input("Enter a text: "))
text_Lower=text.lower()
def remove_punctuation(self):
    for i in punctuations:
        str2=punctuations[i]
        self.split(str2= "")
    print(self)
#def remove_cword():
#def fequent_word():
#def positive_word():
def __main__():
    print("Here is your text in lower case: \n")
    print(text_Lower)
    text_Punct=remove_punctuation(text_Lower)
    print(text_Punct)
Upvotes: 1
Views: 1912
Reputation: 414
You can swap this question around a bit: instead of asking which characters you want to remove, ask which characters you want to keep. It seems you want to keep everything that is a letter, a digit, or whitespace, and you can do that quite simply with a regex using the re library.
import re

def remove_non_alphanumeric(s):
    # \s must sit inside the negated class so that whitespace is kept;
    # placed outside it, the pattern only matches "punctuation followed by whitespace".
    return re.sub(r'[^a-zA-Z0-9\s]', '', s)

def test_remove_non_alphanumeric():
    assert remove_non_alphanumeric('Hello, World! 123') == 'Hello World 123'
    assert remove_non_alphanumeric('abcd1234') == 'abcd1234'
    assert remove_non_alphanumeric('!!!') == ''
    assert remove_non_alphanumeric('a b c d') == 'a b c d'
    assert remove_non_alphanumeric('1 2 3 4') == '1 2 3 4'
    assert remove_non_alphanumeric('a@b#c$d%') == 'abcd'
    assert remove_non_alphanumeric('1!2@3#4$') == '1234'
    assert remove_non_alphanumeric('a\nb\tc\rd') == 'a\nb\tc\rd'

test_remove_non_alphanumeric()
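Since the question also asks for lowercase output, the two steps can be combined. This is only a minimal sketch under the same keep-letters-digits-whitespace assumption; `normalize` is a name made up for illustration, and because the character class is ASCII-only, accented letters would be dropped as well.

```python
import re

def normalize(text):
    # Lowercase first, then drop everything that is not a-z, 0-9 or whitespace.
    return re.sub(r'[^a-z0-9\s]', '', text.lower())

print(normalize("Hello, World! 123"))  # hello world 123
```

Lowercasing before the substitution means the pattern only needs the `a-z` range.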
Upvotes: 2
Reputation: 134
punctuations = [".", ",", "?", ";", "!", ":", "'", "(", ")", "[", "]", "\"", "...", "-", "~", "/", "@", "{", "}", "*"]

def remove_punctuation(text_Lower):
    # Iterating over the list yields the marks themselves, not indices.
    for i in punctuations:
        # Strings are immutable: replace() returns a new string, so reassign it.
        text_Lower = text_Lower.replace(i, "")
    return text_Lower

def main():
    text = str(input("Enter a text: "))
    text_Lower = text.lower()
    text_Punct = remove_punctuation(text_Lower)
    print("Here is your text in lower case:")
    print(text_Lower)
    print("Here is your text in lower case without punctuation:")
    print(text_Punct)

main()
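The code in the question imports `string` but never uses it. As a possible alternative to a hand-written list, here is a sketch using `string.punctuation` with `str.translate`. Note the assumption: `string.punctuation` covers all ASCII punctuation (including characters such as `#`, `$` and `%` that are not in the list above), so the result can differ from the loop version. The function name `remove_punctuation_translate` is made up for illustration.

```python
import string

# Translation table that deletes every ASCII punctuation character.
# Assumption: removing all of string.punctuation is acceptable for the task.
_DELETE_PUNCT = str.maketrans('', '', string.punctuation)

def remove_punctuation_translate(text):
    # translate() returns a new string; the original is left unchanged.
    return text.translate(_DELETE_PUNCT)

print(remove_punctuation_translate("hello, world! (test)"))  # hello world test
```

This does the removal in a single pass over the text instead of one `replace` call per punctuation mark.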
Upvotes: 0