Reputation: 728
I am learning to code in Python. Now I am experimenting with a file comparison program from here.
My code is:
#!/usr/bin/python3
def main():
    fhand1 = open('mbox.txt')
    print('file handle for mbox is {}'.format(fhand1))
    count = 0
    for l1 in fhand1:
        count = count + 1
        l1 = l1.rstrip() # Skip 'uninteresting lines'
        if l1.startswith('From:'):
            print('{}'.format(l1))
    print('Number of lines: {}'.format(count))
    fhand2 = open('mbox-short.txt')
    #inp = fhand2.read(), when here for loop does not work
    #for l2 in fhand2:
        #if l2.startswith('From:'):
            #print('{}'.format(l2))
    inp = fhand2.read() # if for loop is active then this does not work
    print('Total characters in mbox-short: {}'.format(len(inp)))
    print('First 20 characters on mbox-short: {}'.format(inp[:56]))

if __name__ == "__main__": main()
My question is about 'mbox-short.txt'. When I put inp = fhand2.read() before the for l2 in fhand2: loop, the for loop does not run. When I change the sequence, the read() operation does not work.
Can someone please explain this?
Btw, I am using JetBrains PyCharm Community Ed 4 IDE.
Thank you in advance.
Upvotes: 2
Views: 896
Reputation: 1364
Calling .read() on a file object consumes it: the file position moves to the end of the file, so you can't loop over its lines anymore. You can test this by calling read() with the optional size argument. The size of mbox-short.txt is 94626. Calling read(94625) reads the first 94625 characters of your file into a string (in Python 2 the unit is bytes; for a plain ASCII file the two are the same). You can then loop over the remaining 1 character in the file object (which is the newline character \n). Without a size argument, read() reads the whole file into a string, and therefore nothing remains to iterate over.
filehandle = open("mbox-short.txt")
fileString = filehandle.read(94625)
print(len(fileString))
count = 0
for x in filehandle:
    print(x)
    count += 1
print(count)
See: https://docs.python.org/2/library/stdtypes.html?highlight=read#file.read
(In the Python 3 documentation there is no file.read() entry; the equivalent method is documented in the io module as io.TextIOBase.read(), and it behaves the same way for this purpose.)
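The effect can be reproduced without the real mbox-short.txt. The following is a minimal sketch, assuming a small temporary file whose three lines are made up for illustration; the helper name demo_partial_read is also hypothetical. It reads everything except the last character, lets a for loop pick up the rest, and then repeats the experiment with a full read().

```python
import os
import tempfile


def demo_partial_read():
    """Show that iteration resumes where read() stopped."""
    # Hypothetical stand-in for mbox-short.txt: three short lines.
    text = "From: a@example.com\nSubject: hi\nFrom: b@example.com\n"
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write(text)
        path = tmp.name

    try:
        with open(path) as fh:
            head = fh.read(len(text) - 1)    # everything except the final "\n"
            rest = [line for line in fh]     # iteration continues after the read
        with open(path) as fh:
            fh.read()                        # no size: consumes the whole file
            after_full_read = [line for line in fh]
    finally:
        os.remove(path)
    return text, head, rest, after_full_read


text, head, rest, after_full_read = demo_partial_read()
print(len(text) - len(head))   # one character was left unread
print(rest)                    # ['\n']
print(after_full_read)         # []
```

The empty list in the last line is the same situation as in the question: after a full read() the loop has nothing left to iterate over.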
Upvotes: 0
Reputation: 1657
What is happening here is that read() returns the full contents of the file and leaves the file position at the end of the file. Any later read() or for loop on the same handle finds nothing left, which is why you are receiving an empty string.
You need to do either this:
fhand2 = open('mbox-short.txt')
inp = fhand2.read() # read the whole file once
for l2 in inp.splitlines(): # loop over the string you already have, not the exhausted handle
    if l2.startswith('From:'):
        print('{}'.format(l2))
# inp = fhand2.read() # comment out the second one; it would return an empty string
or this:
fhand2 = open('mbox-short.txt')
for l2 in fhand2: # first pass: iterate line by line
    if l2.startswith('From:'):
        print('{}'.format(l2))
fhand2.close()
fhand2 = open('mbox-short.txt') # re-open the file you have already read
inp = fhand2.read() # second pass: read() starts from the beginning again
See more information on Python I/O here.
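Re-opening is not the only way to get back to the start. As a hedged alternative, the sketch below rewinds the same handle with seek(0). The function name read_twice_with_seek and the sample data are assumptions made for illustration; they are not from the original post.

```python
import os
import tempfile


def read_twice_with_seek(path):
    """Iterate over a file line by line, rewind, then read it as one string."""
    with open(path) as fh:
        from_lines = [line.rstrip() for line in fh if line.startswith("From:")]
        fh.seek(0)           # move the file position back to the start
        whole = fh.read()    # read() now sees the full content again
    return from_lines, whole


# Hypothetical sample data standing in for mbox-short.txt.
sample = "From: a@example.com\nSubject: hi\nFrom: b@example.com\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

from_lines, whole = read_twice_with_seek(sample_path)
os.remove(sample_path)

print(from_lines)
print('Total characters: {}'.format(len(whole)))
```

The with block also closes the handle automatically, which the open() calls in the question never do.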
Upvotes: 1
Reputation: 187
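inp = fhand2.readlines() should fix your problem: it reads every line into a list once, and you can then loop over that list as often as you need instead of going back to the file handle. FYI, see also: How do I read a file line-by-line into a list?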
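A minimal sketch of that approach, assuming a small temporary file in place of mbox-short.txt (the sample content and variable names are made up for illustration): the file is read once with readlines(), and both the 'From:' filter and the character count work on the resulting list.

```python
import os
import tempfile

# Hypothetical sample data standing in for mbox-short.txt.
sample = "From: a@example.com\nSubject: hi\nFrom: b@example.com\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

with open(sample_path) as fhand2:
    lines = fhand2.readlines()    # list of lines, each still ending in "\n"
os.remove(sample_path)

# The list can be traversed any number of times.
from_lines = [l2.rstrip() for l2 in lines if l2.startswith('From:')]
total_chars = sum(len(l2) for l2 in lines)

print(from_lines)
print('Total characters: {}'.format(total_chars))
```

Note that readlines() also consumes the handle, so the point is to keep working with the list afterwards. For very large files this holds the whole content in memory.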
Upvotes: 0
Reputation: 3160
The read() method reads the full file into a single string. So if, say, your file looks like
1 2 3 4
5 6 7 8
then read() returns "1 2 3 4\n5 6 7 8\n". If you then loop over that string (for example, for ch in inp), you go through each and every character in it, i.e. 1, a space, 2 and so on. Looping over fhand2 itself after read() yields nothing at all, because the whole file has already been consumed.
If you want to read line by line, either use readline(), which fetches the next line, or use readlines(), which fetches a list like ["1 2 3 4\n", "5 6 7 8\n"].
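The difference between iterating over characters and over lines can be checked directly on a string. This is a small sketch that assumes the two-line content from the example above; the variable names are chosen for illustration only.

```python
# What read() would return for the two-line example file.
content = "1 2 3 4\n5 6 7 8\n"

chars = [ch for ch in content]              # iterating a string yields characters
lines = content.splitlines()                # lines without the trailing "\n"
kept = content.splitlines(keepends=True)    # same shape as a readlines() result

print(chars[:3])   # ['1', ' ', '2']
print(lines)       # ['1 2 3 4', '5 6 7 8']
print(kept)        # ['1 2 3 4\n', '5 6 7 8\n']
```

So if the file has already been read into a string, splitlines() gives line-by-line access without touching the file handle again.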
Upvotes: 0