Reputation: 51
I have the following function. The program looks at each file and writes the lines occurring in all 4 files to a new file. I've tried file1.close(), but I get an error about closing a set. I think I could use a with statement, but I'm not sure how to do this; I'm very new to programming.
def secretome():
    file1 = set(line.strip() for line in open(path + "goodlistSigP.txt"))
    file2 = set(line.strip() for line in open(path + "tmhmmGoodlist.txt"))
    file3 = set(line.strip() for line in open(path + "targetpGoodlist.txt"))
    file4 = set(line.strip() for line in open(path + "wolfPsortGoodlist.txt"))
    newfile = open(path + "secretome_pass.txt", "w")
    for line in file1 & file2 & file3 & file4:
        if line:
            newfile.write(line + '\n')
    newfile.close()
Upvotes: 0
Views: 107
Reputation: 43
This seems to be a very complicated way of doing it. I'd suggest something like the example I've given here.
import fileinput

files = ['file1.txt','file2.txt','file3.txt','file4.txt']
output = open('output.txt','w')
for file in files:
    for line in fileinput.input([file]):
        output.write(line)
    output.write('\n')
output.close()
This code builds a list of the files (replace the names with the required file paths), opens a file to store the output, and then iterates over the inputs with the fileinput module, writing each line to the output file as it goes. The output.write('\n') call after each file ensures that the next file's lines start on a new line in the output file.
Upvotes: 1
Reputation: 310117
To take this a completely different direction than my original (which Lattyware beat me to):
You could define a function:
def file_lines(fname):
    with open(fname) as f:
        for line in f:
            yield line
Now you can use itertools.chain to iterate over your files:
import itertools

def set_from_file(path):
    filenames = ("name1","name2","name3",...) #your input files go here
    #map() is used instead of itertools.imap so this also runs on Python 3;
    #file_lines is a generator function, so nothing is opened yet either way
    lines = itertools.chain.from_iterable(map(file_lines,filenames))
    #lines is an iterable object.
    #At this point, virtually none of your system's resources have been consumed
    with open("output",'w') as fout:
        #Now we only need enough memory to store the non-duplicate lines :)
        fout.writelines(set( line.strip()+'\n' for line in lines) )
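For reference, here is a small self-contained Python 3 sketch of the same idea that can be run as-is. The temporary directory, the file names a.txt and b.txt, and the helper unique_lines are made up for illustration and are not part of the original question.

```python
import itertools
import os
import tempfile


def file_lines(fname):
    # Generator: the file is opened only once iteration starts and is
    # closed when the generator is exhausted or closed.
    with open(fname) as f:
        for line in f:
            yield line


def unique_lines(filenames):
    # Hypothetical helper: chain all files lazily, then de-duplicate.
    lines = itertools.chain.from_iterable(map(file_lines, filenames))
    return set(line.strip() for line in lines if line.strip())


# Illustrative input files in a throwaway directory.
tmpdir = tempfile.mkdtemp()
contents = {"a.txt": "x\ny\n", "b.txt": "y\nz\n"}
paths = []
for name, text in contents.items():
    p = os.path.join(tmpdir, name)
    with open(p, "w") as fh:
        fh.write(text)
    paths.append(p)

result = unique_lines(paths)
print(sorted(result))
```

Note that this collects every distinct line found in any of the files; it does not restrict the result to lines that appear in all of them.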
Upvotes: 1
Reputation: 89087
I would suggest removing the repetition by extracting your set generation into a function:
import os

def set_from_file(path):
    with open(path) as file:
        return set(line.strip() for line in file)

def secretome():
    files = ["goodlistSigP.txt", "tmhmmGoodlist.txt", "targetpGoodlist.txt", "wolfPsortGoodlist.txt"]
    data = [set_from_file(os.path.join(path, file)) for file in files]
    with open(os.path.join(path, "secretome_pass.txt"), "w") as newfile:
        newfile.writelines(line + "\n" for line in set.union(*data) if line)
Note that you are doing an intersection in your code, but you talk about wanting a union, so I used set.union() here. If you only want the lines common to every file, use set.intersection(*data) instead. There are also a couple of list comprehensions/generator expressions.
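To make the difference between the two set operations concrete, here is a minimal sketch using three small throwaway files. The file names and contents are invented for the example; only set_from_file mirrors the function above.

```python
import os
import tempfile


def set_from_file(path):
    # The with block closes the file as soon as the set has been built.
    with open(path) as f:
        return set(line.strip() for line in f)


# Illustrative sample files (names and contents are assumptions).
tmpdir = tempfile.mkdtemp()
samples = {
    "one.txt": "a\nb\nc\n",
    "two.txt": "b\nc\nd\n",
    "three.txt": "c\nd\ne\n",
}
data = []
for name, text in samples.items():
    p = os.path.join(tmpdir, name)
    with open(p, "w") as fh:
        fh.write(text)
    data.append(set_from_file(p))

in_any = set.union(*data)         # lines found in at least one file
in_all = set.intersection(*data)  # lines found in every file

print(sorted(in_any))
print(sorted(in_all))
```

set.intersection(*data) is equivalent to chaining the sets with &, as the original function does.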
Upvotes: 4
Reputation: 91119
You could put it into a generator:
def closingfilelines(*a):
    with open(*a) as f:
        for line in f:
            yield line
and use it where you currently use open().
While the generator runs, the file is kept open; once the generator is exhausted, the file gets closed.
The same happens if the generator object is .close()d or dropped: in that case the generator receives a GeneratorExit exception, which causes the with block to be exited as well.
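A small sketch of that behaviour, assuming a throwaway temp file. The opened list exists only so the demo can inspect the file objects afterwards; it is not needed in real code.

```python
import os
import tempfile

opened = []  # demo-only bookkeeping so the file objects can be inspected


def closingfilelines(*args):
    with open(*args) as f:
        opened.append(f)
        for line in f:
            yield line


# Illustrative input file.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as fh:
    fh.write("first\nsecond\nthird\n")

# Case 1: the generator is consumed completely, so the with block ends.
lines = set(line.strip() for line in closingfilelines(demo_path))
closed_after_exhaustion = opened[0].closed

# Case 2: the generator is closed early.
gen = closingfilelines(demo_path)
first = next(gen)
open_while_running = not opened[1].closed
gen.close()  # raises GeneratorExit inside the generator, leaving the with block
closed_after_close = opened[1].closed

os.remove(demo_path)
```

Applied to the question's code, each open(path + "...") call inside the set(...) expressions would simply become closingfilelines(path + "...").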
Upvotes: 1