Petite Foufoune

Reputation: 25

Remove punctuation and lowercase string

I have to write code that takes an input (text), strips all punctuation from it, and makes it all lowercase. I wrote what I knew, but it doesn't give the outcome I wanted. To start, I made a simple lowercase step, although it doesn't seem to work. For stripping the punctuation, I made a list of all possible punctuation marks and created a variable that would update to the next mark on each iteration, then ran it through a split function. I also use a main function to call all my functions once I've finished. I don't know if this is the cause of my issue, or if this would be easier if I did it in a class. Any input?

import string
punctuations = [".", ",", "?", ";", "!", ":", "'", "(", ")", "[", "]", "\"", "...", "-", "~", "/", "@", "{", "}", "*"]
text= str(input("Enter a text: "))
text_Lower=text.lower()
def remove_punctuation(self):
    for i in punctuations:
        str2=punctuations[i]
        self.split(str2= "")
    print(self)

#def remove_cword():
#def fequent_word():
#def positive_word():





def __main__():
    print("Here is your text in lower case: \n")
    print(text_Lower)
    text_Punct=remove_punctuation(text_Lower)
    print(text_Punct)

Upvotes: 1

Views: 1912

Answers (2)

Henrik Klev

Reputation: 414

You can turn this question around a bit: instead of asking which characters you want to remove, ask which characters you want to keep. It seems you want to keep everything that is a letter, a digit or whitespace, and you can do that quite simply with a regex using the re module.

import re

def remove_non_alphanumeric(s):
    # \s belongs inside the negated class: remove anything that is not a letter, digit or whitespace
    return re.sub(r'[^a-zA-Z0-9\s]', '', s)

def test_remove_non_alphanumeric():
    assert remove_non_alphanumeric('Hello, World! 123') == 'Hello World 123'
    assert remove_non_alphanumeric('abcd1234') == 'abcd1234'
    assert remove_non_alphanumeric('!!!') == ''
    assert remove_non_alphanumeric('a b c d') == 'a b c d'
    assert remove_non_alphanumeric('1 2 3 4') == '1 2 3 4'
    assert remove_non_alphanumeric('a@b#c$d%') == 'abcd'
    assert remove_non_alphanumeric('1!2@3#4$') == '1234'
    assert remove_non_alphanumeric('a\nb\tc\rd') == 'a\nb\tc\rd'

test_remove_non_alphanumeric()
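
The question also asks for lowercase output, which the function above leaves to the caller. A minimal sketch of doing both in one step, assuming ASCII-only text is acceptable; the name normalize_text is hypothetical and not part of the question or this answer:

```python
import re

def normalize_text(s):
    # Lowercase first, then drop every character that is not
    # an ASCII letter, a digit or whitespace.
    return re.sub(r'[^a-z0-9\s]', '', s.lower())

print(normalize_text("Hello, World! It's 123..."))  # hello world its 123
```

Note that this pattern also strips accented letters such as é. Using [^\w\s] instead would probably be the closer fit for non-English text, at the cost of keeping underscores.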

Upvotes: 2

Marcel Suleiman

Reputation: 134

punctuations = [".", ",", "?", ";", "!", ":", "'", "(", ")", "[", "]", "\"", "...", "-", "~", "/", "@", "{", "}", "*"]

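def remove_punctuation(text_Lower):
    # Iterating over the list yields the marks themselves, not indices,
    # so each one can be passed straight to replace().
    for i in punctuations:
        # Strings are immutable: replace() returns a new string that must be reassigned.
        text_Lower = text_Lower.replace(i, "")

    # Return the result so the caller can print or reuse it.
    return text_Lower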
    

def main():
    text = input("Enter a text: ")  # input() already returns a str

    text_Lower = text.lower()

    text_Punct = remove_punctuation(text_Lower)

    print("Here is your text in lower case:")
    print(text_Lower)

    print("Here is your text in lower case without punctuation:")
    print(text_Punct)


main()
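
As an alternative to looping over a hand-written list, the string module that the question already imports provides string.punctuation, which can be fed to str.translate. This is only a sketch under the assumption that ASCII punctuation is all that needs removing; strip_punctuation and _PUNCT_TABLE are hypothetical names:

```python
import string

# Translation table that deletes every ASCII punctuation character.
# (Assumption: string.punctuation covers the marks you care about.)
_PUNCT_TABLE = str.maketrans('', '', string.punctuation)

def strip_punctuation(text):
    # Lowercase the text, then remove punctuation in a single pass.
    return text.lower().translate(_PUNCT_TABLE)

print(strip_punctuation("Hello, World! (It's ~fine~)"))  # hello world its fine
```

This avoids maintaining the list by hand, although typographic marks outside ASCII (curly quotes, for example) would still pass through.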

Upvotes: 0
