Reputation: 1846
I have a text file of names, all of which have three spaces at the end of them, which I would like to remove. When I print these names in Python, I get output like the following:
Adeline Panella Â
Winifred Aceto Â
See Weckerly Â
Daniell Hildebrand Â
Betsey Coulter Â
#there are about 1000 of these names
To remove the extra spaces, I wrote the following script:
import os
script_directory = os.path.dirname(__file__)
file = open(os.path.join(script_directory, "assets/data/names.txt"), 'r')
potential_names = file.read().splitlines()
potential_names = list(filter(None, potential_names))
for item in potential_names:
    print(item)
    item = item[:-3]
    print(item)
file.close()
file = open(os.path.join(script_directory, "assets/data/names.txt"), 'w')
for item in potential_names:
    file.write("{}\n".format(item))
file.close()
It appears to function as expected, as the output is as follows:
Adeline Panella Â
Adeline Panella
Winifred Aceto Â
Winifred Aceto
See Weckerly Â
See Weckerly
Daniell Hildebrand Â
Daniell Hildebrand
Betsey Coulter Â
Betsey Coulter
HOWEVER: When I run the script a second time, the output is exactly the same, and when I examine the text file, the three spaces at the end remain there. How can I permanently remove this extra spacing?
Upvotes: 0
Views: 100
Reputation: 881153
for item in potential_names:
    print(item)
    item = item[:-3]
    print(item)
When you change item on that third line above, the change is not reflected back in the potential_names collection; it simply rebinds the name item to a new string. That's why the second print makes it appear that the string is being modified(1).
However, later, when you process the collection:
for item in potential_names:
that's the original contents of the collection you're outputting.
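To make that concrete, here is a minimal sketch of the behaviour. The sample list and the '---' suffix (standing in for the three trailing characters) are made up for illustration. It also shows one alternative that is not part of the answer above: assigning back into the list by index.

```python
# '---' stands in for the three unwanted trailing characters (illustrative data).
names = ["Adeline Panella---", "Winifred Aceto---"]

for item in names:
    item = item[:-3]  # rebinds the loop variable only; the list is untouched

# Snapshot taken after the loop: still the original strings.
unchanged = list(names)

# Alternative (an assumption, not from the answer): write back by index.
for index, item in enumerate(names):
    names[index] = item[:-3]
```

After the first loop the list still holds the untrimmed strings; only the second loop, which assigns to names[index], changes what the list contains.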
One way to get around this is to simply construct a new list with the final three characters removed from each item:
potential_names = [x[:-3] for x in potential_names]
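Putting that together with the file handling from the question might look roughly like the sketch below. The function name trim_trailing_chars, the count parameter, and the temporary demo file are hypothetical, introduced only so the example can run on its own; the real script would pass its own path to names.txt.

```python
import os
import tempfile


def trim_trailing_chars(path, count=3):
    """Rewrite the file at `path`, dropping the last `count` characters of each non-empty line."""
    with open(path, "r") as source:
        # Same idea as splitlines() plus filter(None, ...) in the question.
        lines = [line for line in source.read().splitlines() if line]
    # Build a NEW list of trimmed strings; max() keeps the slice end non-negative.
    trimmed = [line[:max(len(line) - count, 0)] for line in lines]
    with open(path, "w") as target:
        for line in trimmed:
            target.write("{}\n".format(line))
    return trimmed


# Demo on a throwaway file (hypothetical data, not the real names.txt).
demo_path = os.path.join(tempfile.mkdtemp(), "names.txt")
with open(demo_path, "w") as handle:
    handle.write("Adeline Panella   \nWinifred Aceto   \n")

result = trim_trailing_chars(demo_path)
with open(demo_path) as handle:
    written = handle.read()
```

Note that slicing removes three characters every time it runs, whatever they are, so a script like this should be run only once on a given file.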
(1) Python is generally considered a pure object-oriented language because everything is an object to which names refer.
One consequence is that the expression item = '12345'; item = item[:-3] doesn't change the value of the underlying '12345' string; it creates a new string and changes the item reference to refer to it.
That aspect of the language was a real eye-opener once I figured out how it works.
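A small sketch of the footnote's point, using a second name (original, introduced here purely for illustration) to keep hold of the first string:

```python
original = "12345"
item = original      # two names now refer to the same string object
item = item[:-3]     # slicing builds a new string and rebinds item to it

# original still refers to the untouched '12345' object.
same_object = item is original
```

Here item ends up as '12' while original is still '12345', and same_object is False because the two names no longer refer to the same object.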
Upvotes: 4