Reputation: 15
I am trying to strip all the whitespace, commas, and apostrophes from my list, which comes from a text file supplied by the user. I want to filter it down to just the numbers (with no spaces in between).
I attempted to remove the whitespace, commas, square brackets, and apostrophes in the variable 'file_strip', but it outputs the same as 'file_stored_in_list'.
Can anyone help me come up with a solution to filter the text file down to just numbers? If there are more efficient ways of reading the text file, please let me know! Thanks!
filename = input("Input the name of the file: ")
file = open(filename, "r")
#Stores the text file into a list
file_stored_in_list = file.read().splitlines()
file.close()
#from .txt file: Outputs ['2 7 6', '9 5 1', '4 3 8']
print(file_stored_in_list)
#Attempted to remove white-spaces; tried with commas, square
#brackets and apostrophes, left blank for now
file_strip = [i.strip(" ") for i in file_stored_in_list]
#Outputs the same ['2 7 6', '9 5 1', '4 3 8']
print(file_strip)
Upvotes: 0
Views: 367
Reputation: 2793
A regex sub should do the trick.
import re
mylines = []
with open(myfile) as f:  # better, more pythonic
    mylines = f.readlines()
clean_lines = []
clean_lines = [re.sub(r"\s+", " ", l) for l in mylines]
This worked for me when I tried:
>>> import re
>>> re.sub(r"\s+", " ", "a b c")
'a b c'
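Note that the pattern above replaces each run of whitespace with a single space, so it collapses whitespace rather than removing it. As a minimal sketch of the variant that removes whitespace entirely, assuming the sample lines shown in the question (the variable names here are illustrative, not from the original post):

```python
import re

# Sample lines as shown in the question's output
lines = ['2 7 6', '9 5 1', '4 3 8']

# Replace every run of whitespace with the empty string instead of a space
no_spaces = [re.sub(r"\s+", "", line) for line in lines]
print(no_spaces)  # ['276', '951', '438']
```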
Upvotes: 0
Reputation: 149185
Oops... you are trying to remove characters that do not exist in the file!
I would bet a coin that the content of the file is just:
2 7 6
9 5 1
4 3 8
But you read it with:
file = open(filename, "r")
#Stores the text file into a list
file_stored_in_list = file.read().splitlines()
file.close()
From there on, file_stored_in_list is a nice list of nice strings. To make sure of it, just print it line by line:
for line in file_stored_in_list:
    print(line)
But when you print a list, Python prints square brackets ([]) around the list, and prints the representation of the elements. And the representation of a string is that string enclosed in quotes...
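To illustrate the difference, here is a small sketch using the list from the question. The names list_text, first_text, and first_repr are only there for the demonstration:

```python
file_stored_in_list = ['2 7 6', '9 5 1', '4 3 8']

# str() of a list uses the repr of each element, hence the quotes and commas
list_text = str(file_stored_in_list)
print(list_text)   # ['2 7 6', '9 5 1', '4 3 8']

# The string itself contains only digits and spaces
first_text = str(file_stored_in_list[0])
print(first_text)  # 2 7 6

# repr() of the string is what adds the surrounding quotes
first_repr = repr(file_stored_in_list[0])
print(first_repr)  # '2 7 6'
```

So the brackets, commas, and quotes are part of how the list is displayed, not part of the data.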
BTW, the correct way of reading a file line by line is:
with open(filename) as file:
    for line in file:
        # process the line...
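As a rough sketch of what "process the line" could look like for the question's goal (digits only, no spaces): the example below uses io.StringIO as a stand-in for the real file so it runs on its own, and assumes the file content guessed above. In real code you would use open(filename) instead.

```python
import io

# Stand-in for open(filename); assumed file content, one row per line
fake_file = io.StringIO("2 7 6\n9 5 1\n4 3 8\n")

digits_per_line = []
with fake_file as file:
    for line in file:
        # split() with no arguments splits on any whitespace, including the
        # trailing newline, so joining the pieces leaves only the digits
        digits_per_line.append("".join(line.split()))

print(digits_per_line)  # ['276', '951', '438']
```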
Upvotes: 2
Reputation: 4279
One way of approaching this is with a translation table:
translation = str.maketrans("", "", " \t,[]'")
file_strip = [item.translate(translation) for item in file_stored_in_list]
Another way is to use regular expressions:
import re
reg = re.compile(r'\D') # \D is anything other than digits
file_strip = [re.sub(reg, '', item) for item in file_stored_in_list]
It's worth noting that strip(" ") doesn't work in the way you expected: it will only remove spaces from the beginning and end of your string. See the documentation.
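A short sketch of that behaviour, using a made-up sample string with padding at both ends and comparing strip with str.replace:

```python
text = "  2 7 6  "

# strip(" ") only trims spaces at the two ends; inner spaces stay
stripped = text.strip(" ")
print(repr(stripped))  # '2 7 6'

# replace(" ", "") removes every space, including those between the digits
replaced = text.replace(" ", "")
print(repr(replaced))  # '276'
```

This is why the list in the question came out unchanged: its strings had no leading or trailing spaces to strip.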
Upvotes: 1