Reputation: 13
I need help removing the leading and trailing whitespace from every non-blank line of my output in Python 3. I want to do this so I can convert the string into a list. Currently, when I run my program, it outputs this:
Belgian Waffles
$5.95
Two of our famous Belgian Waffles with plenty of real maple syrup
650
Strawberry Belgian Waffles
(extra output)
Homestyle Breakfast
$6.95
Two eggs, bacon or sausage, toast, and our ever-popular hash browns
950
(end of output)
The result I'm trying to get is this:
Belgian Waffles
$5.95
Two of our famous Belgian Waffles with plenty of real maple syrup
650
Strawberry Belgian Waffles
(extra output)
Homestyle Breakfast
$6.95
Two eggs, bacon or sausage, toast, and our ever-popular hash browns
950
(end of output)
This is my current code:
import os
import re

def get_filename():
    print("Enter the name of the file: ")
    filename = input()
    return filename

def read_file(filename):
    if os.path.exists(filename):
        with open(filename, "r") as file:
            full_text = file.read()
            return full_text
    else:
        print("This file does not exist")

def get_tags(full_text):
    tags = re.findall('<.*?>', full_text)
    for tag in tags:
        full_text = full_text.replace(tag, '')
    return tags

def get_text(text):
    tags = re.findall('<.*?>', text)
    for tag in tags:
        text = text.replace(tag, '')
    text = text.strip()
    return text

def display_output(text):
    print(text)

def main():
    filename = get_filename()
    full_text = read_file(filename)
    tags = get_tags(full_text)
    text = get_text(full_text)
    display_output(text)

main()
Any help or suggestions would be appreciated.
Upvotes: 1
Views: 80
Reputation: 21
You can either use a regular expression or use the function below, which iterates over every character and checks its ASCII value. If the character is a letter, a digit, or one of the values in the list, it is appended to the final string. You can look up other values in an ASCII table and extend the list as needed.
def removeGarbageCharacter(thisString):
    FinalString = ""
    # ASCII of double quote, single quote, comma, and space
    ASCII_of_Other_char = [34, 39, 44, 32]
    for thisChar in thisString:
        asciiVal = ord(thisChar)
        # Keep A-Z, a-z, 0-9, and anything in the extra list
        if (65 <= asciiVal <= 90) or (97 <= asciiVal <= 122) or (48 <= asciiVal <= 57) or asciiVal in ASCII_of_Other_char:
            FinalString += thisChar
    return FinalString
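For the regular expression route, here is a minimal sketch. It assumes the unwanted whitespace is spaces and tabs around each line, and that blank lines should be dropped. The function name lines_without_padding is made up for illustration and is not part of the question's code. It returns a list directly, which seems to be the end goal.

```python
import re

def lines_without_padding(text):
    # Hypothetical helper: returns each non-blank line with its
    # surrounding spaces and tabs removed.
    # re.MULTILINE makes ^ and $ match at the start and end of every line.
    # \S requires at least one non-whitespace character, so blank
    # (or whitespace-only) lines produce no match and are skipped.
    return re.findall(r"^[ \t]*(\S.*?)[ \t]*$", text, flags=re.MULTILINE)

sample = "\n   Belgian Waffles\n\n      $5.95\n\t650  \n\n"
print(lines_without_padding(sample))
```

If the text is needed as a single string again, the list can be rejoined with "\n".join(...). Without a regex, a list comprehension over text.splitlines() that calls strip() on each line and skips empty results should give the same list.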
Upvotes: 1