James Davinport

Reputation: 287

Remove Punctuation, output remaining text-Python 3 Function

I want to run this file through my function and then output the remaining text once the integers (numbers) are removed. Below is my Python code:

import re

the_file = open("home/filepath/file", 'rt', encoding='latin-1').read()
words = the_file.split()

def replace_numbers(words):
    new_words=[]
    for word in words:
        new_word= re.sub(" \d+", " ", word)
        if new_word !='':
            new_words.append(new_word)
    return new_words

replace_numbers(words)

Here is some sample text in the file:

[email protected] 366-44-4444 Jezos was born Jeffrey Preston Jorgensen on January 12, 1964, also 5 and 4

I want the output to be:

[email protected] 366-44-4444 Jezos was born Jeffrey Preston Jorgensen on January 12, 1964, also and

So basically it removes all the standalone integers from the text file. Simple.

Is there a way to remove all the numbers in the file and then output what's left? As of right now, the output is just []. I think the issue is probably in the if new_word != '': check, but I can't seem to find it.

Upvotes: 1

Views: 74

Answers (1)

tobias_k

Reputation: 82899

If you just want to remove the tokens that consist entirely of digits, you do not even need re. Just split the text and keep every word for which isdigit() is false.

>>> text = "[email protected] 366-44-4444 Jezos was born Jeffrey Preston Jorgensen on January 12, 1964, also 5 and 4"
>>> [word for word in text.split() if not word.isdigit()]
['[email protected]', '366-44-4444', 'Jezos', 'was', 'born', 'Jeffrey', 'Preston', 'Jorgensen', 'on', 'January', '12,', '1964,', 'also', 'and']
>>> ' '.join(_)
'[email protected] 366-44-4444 Jezos was born Jeffrey Preston Jorgensen on January 12, 1964, also and'
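
The same idea can be wrapped in a reusable function. The sketch below is only an illustration: the function names and the shortened sample string are hypothetical and not part of the original post. It also shows a re-based variant. Note that after split() no word contains a space, so a pattern such as " \d+" that requires a leading space can never match. Testing each whole token with re.fullmatch avoids that problem.

```python
import re


def remove_integer_tokens(text):
    """Drop whitespace-separated tokens that consist only of digits."""
    kept = [word for word in text.split() if not word.isdigit()]
    return " ".join(kept)


def remove_integer_tokens_re(text):
    """Same idea with re: test each whole token, with no leading space in the pattern."""
    kept = [word for word in text.split() if not re.fullmatch(r"\d+", word)]
    return " ".join(kept)


# Hypothetical, shortened sample text for illustration.
sample = "Jezos was born on January 12, 1964, also 5 and 4"
print(remove_integer_tokens(sample))
# prints: Jezos was born on January 12, 1964, also and
```

Tokens such as "12," or "366-44-4444" are kept because they contain non-digit characters, which matches the desired output in the question. To process the file, pass the string returned by read() to either function.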

Upvotes: 2
