Reputation: 31
I am trying to write a programme that goes through a directory, opens each file in turn, and checks a specific line before anything else. If the line meets a specific criterion (namely, that it does not match this line in any other file in the directory), the file closes and the programme moves on to the next file.
aps = []
import os
for filename in os.listdir("C:\..."):
    f = open(filename,"r")
    (f.readline())
    (f.readline())
    ap = (f.readline())
    ap = ap.rstrip("\n")
    aps.append(ap)
    freqs = {}
    for ap in aps:
        freqs[ap] = freqs.get(ap, 0) + 1
    for k, v in freqs.items():
        if v == 2:
            f.close()
        else:
For the 'else:', I originally tried 'f.seek(0)', but got an error saying that Python cannot operate on a closed file. I then tried 'f = open(filename, "r")' again, but this does something odd: when I try to print the first line this way, it goes into a crazy loop and prints the line multiple times.
Is this the best way to go about this task? And if not, how could I get it to work?
Many thanks.
Upvotes: 0
Views: 653
Reputation: 124
Here is why your code fails. You initialize the aps list outside of your outer for loop, so it will contain the specified line from all files that you loop over. Then your freqs dictionary is reset to empty for each file that you open.
So these lines:
for ap in aps:
    freqs[ap] = freqs.get(ap, 0) + 1
loop over each line that has been read so far, and count the frequency. The problem comes in the inner for loop:
for k, v in freqs.items():
    if v == 2:
        f.close()
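What happens here is that freqs has a set of keys potentially as large as the number of files you have looped over so far, and you are looping through each key. So the first time a key has a value of 2, the current file is closed. But then the loop continues, and for any later key whose value is not 2 the else branch runs, so f.seek(0) is called on a file that is already closed. That raises ValueError: I/O operation on closed file. (Calling f.close() a second time is harmless by itself.) Reopening the file in the else branch instead runs once per such key, which is why the line gets printed multiple times.
The easiest fix is to add a break after the f.close(). But there are better ways to structure this code.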
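The closed-file error from the question can be reproduced in isolation. This is only a minimal sketch: the temporary file and its contents are made up for illustration and are not part of the original code.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained (hypothetical content).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("line 1\nline 2\nline 3\n")
    path = tmp.name

f = open(path, "r")
f.close()

try:
    f.seek(0)  # seeking (or reading) on a closed file object fails
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
os.remove(path)
```

Any code path that can reach f.seek(0) or f.readline() after f.close() has run will hit this error, which is what the loop over freqs.items() allows.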
One is to always open a file using a with statement, unless you have a good reason to do otherwise. So:
with open(filename,"r") as f:
    #code
That way the file will close automatically when you are done with it.
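To see the automatic close in action, here is a small sketch. The temporary file is an assumption made only so the example runs on its own; f.closed is the standard attribute that reports whether a file object has been closed.

```python
import os
import tempfile

# Throwaway file for the demonstration (hypothetical content).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    path = tmp.name

with open(path, "r") as f:
    first_line = f.readline().rstrip("\n")
    open_inside = not f.closed   # True: the file is still open inside the block

closed_after = f.closed          # True: the file was closed when the block ended

print(first_line, open_inside, closed_after)  # first True True
os.remove(path)
```

The file is closed even if the body of the with block raises an exception, so there is no need for a conditional f.close().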
I am assuming that the order you are looping through the files isn't important, and that you want the frequency test to include all the files, not just the ones that have been opened so far. In that case it may be easier to loop through twice, once for assembling your frequency dict, and a second time for doing whatever you want to do to the files that meet frequency requirements.
import os

directory = r"C:\..."  # path elided in the question
aps = []
freqs = {}

# First loop to read the important line from all files
for filename in os.listdir(directory):
    # os.listdir returns bare names, so join them with the directory
    with open(os.path.join(directory, filename), "r") as f:
        f.readline()
        f.readline()
        ap = f.readline().rstrip("\n")
        aps.append(ap)

# Populate the dictionary
for ap in aps:
    freqs[ap] = freqs.get(ap, 0) + 1

# Second loop to handle the important cases
for filename in os.listdir(directory):
    with open(os.path.join(directory, filename), "r") as f:
        f.readline()
        f.readline()
        ap = f.readline().rstrip("\n")
        if freqs[ap] != 2:
            pass  # do whatever
I strongly suspect there are more efficient and pythonic ways of getting there, but this is my best thought.
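One possible tidier variant, offered as a sketch rather than a definitive answer: read the third line of each file once, remember it per file, and count with collections.Counter. The helper names read_third_line and third_lines_by_file are made up for this example, and the demo directory with a.txt, b.txt and c.txt is a hypothetical stand-in for the real one.

```python
import os
import tempfile
from collections import Counter


def read_third_line(path):
    """Return the third line of the file at `path`, without its newline."""
    with open(path, "r") as f:
        f.readline()
        f.readline()
        return f.readline().rstrip("\n")


def third_lines_by_file(directory):
    """Map the full path of each regular file in `directory` to its third line."""
    result = {}
    for name in os.listdir(directory):
        path = os.path.join(directory, name)  # os.listdir gives bare names
        if os.path.isfile(path):
            result[path] = read_third_line(path)
    return result


# Demo on a throwaway directory (hypothetical files and contents).
with tempfile.TemporaryDirectory() as directory:
    for name, third in [("a.txt", "X"), ("b.txt", "X"), ("c.txt", "Y")]:
        with open(os.path.join(directory, name), "w") as f:
            f.write("header 1\nheader 2\n" + third + "\nrest\n")

    lines = third_lines_by_file(directory)
    freqs = Counter(lines.values())
    # Same test as above: keep files whose line does not occur exactly twice.
    selected = sorted(
        os.path.basename(p) for p, line in lines.items() if freqs[line] != 2
    )

print(selected)  # ['c.txt']
```

Because the third line is cached per file, each file is opened only once for the frequency test; it only needs to be reopened if the selected files require further processing.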
Upvotes: 1
Reputation: 49310
Don't close the file conditionally. Do what you need to do with the open file, and then close it at the end. With a with construct, the file will close automatically:
for filename in os.listdir(path):
    # os.listdir returns bare names, so join them with the directory
    with open(os.path.join(path, filename)) as f:
        # do processing here
        if positive_condition:
            pass  # do more processing
Upvotes: 2