novice coder

Reputation: 33

Python: file I/O, counting characters without newlines

I have a text file named number.txt. It contains the following:

0
1
2
3

My code:

def main():
   inFile = open("number.txt", "r")
   text = inFile.read() 
   inFile.close()
   print(len(text))
main()

I have tried to use the above code to print out how many characters are in the file. It prints 8, but there are only 4 visible characters. I know that when Python reads in the file it adds a newline after each line, and these could be the extra characters. How do I get rid of them?

Upvotes: 1

Views: 7024

Answers (6)

Haydn

Reputation: 1

Do it in the print line, like this:

    print(len(text.replace("\n", "")))

Upvotes: 0

groundlar

Reputation: 898

Use string.rstrip('\n'). This removes newlines from the right side of the string, and nothing else. Note that Python converts all newline sequences to \n when a file is read in text mode, regardless of platform. I would also recommend iterating over the lines of the file rather than reading it all into memory, in case the file is large.

Example code:

if __name__ == '__main__':
   count = 0
   with open("number.txt", "r") as fin:
       for line in fin:
           text = line.rstrip('\n')
           count += len(text)
   print(count)
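One point worth spelling out: rstrip('\n') only trims newlines at the end of the string it is called on, so it has to be applied to each line, as in the loop above, and not to the whole file contents at once. Here is a minimal sketch of the difference, using an in-memory string in place of the file. The exact contents, including the trailing newline, are an assumption based on the reported length of 8.

```python
# Assumed contents of number.txt: four digits, each followed by a newline.
text = "0\n1\n2\n3\n"

# rstrip on the whole string removes only the trailing newline.
whole = text.rstrip("\n")
print(len(whole))  # 7

# Stripping each line separately removes every line ending.
per_line = sum(len(line.rstrip("\n")) for line in text.splitlines(keepends=True))
print(per_line)  # 4
```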

Upvotes: 0

Pi Marillion

Reputation: 4674

The file contains a newline at the end of each line. To filter them out, you can either rebuild the string without the newlines using replace, split, or similar, or count the newlines and subtract that count from the length (which is faster and more efficient, since no new string is built).

with open("number.txt", "r") as file:
    text = file.read()
length_without_newlines = len(text) - text.count('\n')

Edit: As @lvc says, Python converts all line endings to '\n' (0x0A) when reading in text mode, including Windows newlines ('\r\n' or [0x0D, 0x0A]), so one need only search for '\n' when looking for newline characters.
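The newline translation can be checked with a small experiment. This is a sketch that writes a temporary file with Windows-style line endings (the temporary file and its contents are assumptions for illustration, not the asker's actual file) and reads it back with and without translation. Passing newline="" to open() turns the translation off.

```python
import os
import tempfile

# Write the bytes in binary mode so the file holds exactly these line endings.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"0\r\n1\r\n2\r\n3\r\n")

# Default text mode: '\r\n' is translated to '\n' on read.
with open(path, "r") as f:
    translated = f.read()

# newline="" disables translation, so the '\r' characters remain.
with open(path, "r", newline="") as f:
    untranslated = f.read()

os.remove(path)

print(repr(translated))    # '0\n1\n2\n3\n'
print(repr(untranslated))  # '0\r\n1\r\n2\r\n3\r\n'
print(len(translated) - translated.count("\n"))  # 4
```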

Upvotes: 4

var211

Reputation: 606

Try this:

if __name__ == '__main__':
    with open('number.txt', 'rb') as in_file:
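        # Assumes every line, including the last, ends with a single '\n' byte.
        # readlines() leaves the position at end of file, so tell() is the total byte count.
        print(abs(len(in_file.readlines()) - in_file.tell()))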

Upvotes: 0

Antonio Ragagnin

Reputation: 2327

Your script's output is correct: newlines are characters too (they are just invisible).

To leave out the newline characters (written in strings as \n or \r\n), replace them with an empty string.

See this code:

def main():
   inFile = open("number.txt", "r")
   text = inFile.read()
   text = text.replace("\r\n","") #in windows, new lines are usually these two characters.
   text = text.replace("\n","")
   inFile.close()
   print(len(text))
main()

For more information about what \r\n and \n are, see: http://en.wikipedia.org/wiki/Newline
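To make the invisible characters visible, it can help to print the repr of the string. This is a small sketch that assumes the file ends with a newline after the last digit, which the reported length of 8 suggests. The string below stands in for what inFile.read() would return.

```python
# Assumed result of inFile.read() for number.txt.
text = "0\n1\n2\n3\n"

# repr shows each newline as an explicit \n escape.
print(repr(text))

# Four digits plus four newlines.
print(len(text))  # 8

newline_count = text.count("\n")
print(newline_count)  # 4
```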

Upvotes: 0

Nicolas Defranoux

Reputation: 2676

As Antonio said in the comment, the newline characters are in the file. If you want, you can remove them:

def main():
   inFile = open("number.txt", "r")
   text = inFile.read() 
   inFile.close()
   text = text.replace('\n', '')  # Replace new lines with nothing (empty string).
   print(len(text))
main()

Upvotes: 1
