Reputation: 33
I have a text file named number.txt. It contains the following:
0
1
2
3
My code:
def main():
    inFile = open("number.txt", "r")
    text = inFile.read()
    inFile.close()
    print(len(text))

main()
I have tried to use the above code to print out how many characters are in the file. It prints out 8, but there are only 4 characters. I know that when Python reads in the file it adds a newline after each line, and these could be the extra characters. How do I get rid of them?
Upvotes: 1
Views: 7024
Reputation: 898
Use string.rstrip('\n'). This will remove newlines from the right side of the string, and nothing else. Note that Python should convert all newline characters to \n, regardless of platform, when the file is opened in text mode. I would also recommend iterating over the lines of the file, rather than reading it all into memory, in case you have a large file.
Example code:
if __name__ == '__main__':
    count = 0
    with open("number.txt", "r") as fin:
        for line in fin:
            # Strip only the trailing newline, then count what is left.
            text = line.rstrip('\n')
            count += len(text)
    print(count)
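The same line-by-line idea can be wrapped in a function so it is easier to reuse and check. This is only a sketch: the function name count_chars_without_newlines and the temporary sample file are illustrative assumptions, not part of the original answer.

```python
import os
import tempfile


def count_chars_without_newlines(path):
    """Sum the length of each line after stripping its trailing newline."""
    total = 0
    with open(path, "r") as fin:
        for line in fin:
            total += len(line.rstrip("\n"))
    return total


# Build a throwaway file with the same contents as number.txt (assumed
# to end with a trailing newline, which matches the reported length of 8).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("0\n1\n2\n3\n")
    sample_path = tmp.name

sample_count = count_chars_without_newlines(sample_path)
print(sample_count)  # 4
os.remove(sample_path)
```

Because the file is read one line at a time, memory use stays small even for large files.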
Upvotes: 0
Reputation: 4674
The file contains a newline between each line. To filter it out, you can either recreate the string without those newlines with replace, split, or similar, or count the newlines and subtract them from the length (which is faster/more efficient).
with open("number.txt", "r") as file:
    text = file.read()
length_without_newlines = len(text) - text.count('\n')
Edit: As @lvc says, Python converts all line endings to '\n' (0x0A) when reading in text mode, including Windows newlines ('\r\n' or [0x0D, 0x0A]), so one need only search for '\n' when finding newline characters.
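A quick way to check the line-ending claim is to write Windows-style endings as raw bytes and read them back both ways. This is a sketch that assumes Python 3, where text mode uses universal newline translation by default; the temporary file and variable names are made up for the demonstration.

```python
import os
import tempfile

# Write raw bytes so the file really contains Windows-style "\r\n" endings.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"0\r\n1\r\n2\r\n3\r\n")
    crlf_path = tmp.name

with open(crlf_path, "r") as f:   # text mode: "\r\n" is translated to "\n"
    text_mode = f.read()
with open(crlf_path, "rb") as f:  # binary mode: bytes are returned untouched
    binary_mode = f.read()
os.remove(crlf_path)

print(repr(text_mode))                         # '0\n1\n2\n3\n'
print(len(binary_mode))                        # 12
print(len(text_mode) - text_mode.count("\n"))  # 4
```

In text mode only '\n' needs to be counted; in binary mode the '\r' bytes would still be present.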
Upvotes: 4
Reputation: 606
Try this:
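if __name__ == '__main__':
    with open('number.txt', 'rb') as in_file:
        # readlines() gives the line count; tell() then gives the total bytes read.
        # The difference is the character count if every line ends in a single '\n'.
        print(abs(len(in_file.readlines()) - in_file.tell()))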
Upvotes: 0
Reputation: 2327
Your script's output is correct: newlines are characters too (they are just invisible).
To omit the newline characters (written in strings as \n or \r\n), you have to replace them with an empty string.
See this code:
def main():
    inFile = open("number.txt", "r")
    text = inFile.read()
    text = text.replace("\r\n", "")  # on Windows, new lines are usually these two characters
    text = text.replace("\n", "")
    inFile.close()
    print(len(text))

main()
For more information about what \r\n and \n are, see: http://en.wikipedia.org/wiki/Newline
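To make the invisible characters visible, repr() can be used on the text. This is a small sketch in which sample_text stands in for what read() would return, assuming the file ends with a trailing newline (which matches the reported length of 8).

```python
# Stand-in for the result of inFile.read() on the four-line file.
sample_text = "0\n1\n2\n3\n"

print(repr(sample_text))        # '0\n1\n2\n3\n'
print(len(sample_text))         # 8
print(sample_text.count("\n"))  # 4

# Replacing each newline with an empty string leaves only the digits.
stripped = sample_text.replace("\n", "")
print(len(stripped))            # 4
```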
Upvotes: 0
Reputation: 2676
As Antonio said in the comment, the newline characters are in the file. If you want, you can remove them:
def main():
    inFile = open("number.txt", "r")
    text = inFile.read()
    inFile.close()
    text = text.replace('\n', '')  # Replace new lines with nothing (empty string).
    print(len(text))

main()
Upvotes: 1