Reputation: 103
I'm a beginner. I have written a Python program with the following pseudocode:
Function 1:
a. This function takes a large single-fasta file (a genome) and splits it into pieces.
b. These pieces are written to a multi-fasta output file (ex. below).
Function 2:
a. This function reads the lines of the multi-fasta file.
b. It writes to an output file the fasta id followed by the length of the fasta entry.
Most of the code:
from Bio import SeqIO
import io
import sys  # needed for sys.argv in main()

def metagenome_simulator(genome_fasta, out_file):
    outfile = open(out_file, "a+b")
    fasta = SeqIO.parse(open(genome_fasta, "rU"), "fasta")
    #does the split, blah, blah - I know this function works on its own already
    len_file.close()
    fasta.close()
    return outfile

def contig_len_calculator(fasta, out_file):
    outfile = io.open(out_file, "wb")
    fhandle = io.open(fasta, "a+b")
    outfile.write("contig_id" + "\t" + "contig_length" + "\n")
    for entry in SeqIO.parse(fhandle, "fasta"):
        #calculates lengths, blah, blah - i know this works independently too
    outfile.close()
    fhandle.close()
    return

def main():
    output = metagenome_simulator(sys.argv[1], sys.argv[2])
    print(output)
    contig_len_calculator(output, sys.argv[3])

main()
And my command (bash shell) would be:
./this_script.py genome_fasta_file split_fasta_out_file final_output_file
The output would be two separate files, one for each function in the program. The first would be the split fasta:
>split_1
ATCG....
>split_2
ATCG....
.
.
.
And the second would be the lengths file:
>split_1 300
>split_2 550
.
.
.
This does not work. It runs Function 1 just fine and makes the split_fasta_out_file, but then prints:
<open file 'out_file', mode 'a+b' at 0x7f54b8454d20>
Traceback (most recent call last):
  File "./this_script.py", line 62, in <module>
    main()
  File "./this_script.py", line 60, in main
    contig_len_calculator(output, sys.argv[3])
  File "./this_script.py", line 47, in contig_len_calculator
    fhandle = io.open(fasta, "a+b")
TypeError: invalid file: <open file 'out_file', mode 'a+b' at 0x7f54b8454d20>
I have no idea why it doesn't work. So my question is this: how do I properly pass a file created in one function to another function?
EDIT: Put the whole traceback error.
Upvotes: 2
Views: 152
Reputation: 54213
The problem is that metagenome_simulator returns an open file object, which you then try to pass into io.open. io.open takes either an integer file descriptor (some_fd.fileno()) or a path. The simple solution is then to return the path to your outfile, rather than the outfile itself.
def metagenome_simulator(genome_fasta, out_file):
    ...  # your code as-written
    return out_file
But if you like you could instead do:
def metagenome_simulator(genome_fasta, out_file):
    # completely as-written, including
    return outfile

def contig_len_calculator(fasta, out_file):
    outfile = io.open(out_file, "wb")
    fhandle = io.open(fasta.fileno(), "a+b")
    ...
The advantage of the first approach is that it makes the out_file and fasta arguments to contig_len_calculator have the same type, which seems sane.
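As a rough illustration of the path-passing approach, here is a minimal sketch that uses only the standard library. The names split_fasta, write_contig_lengths, piece_len and demo are hypothetical, and the hand-rolled FASTA reading is a stand-in for SeqIO.parse so the example has no dependencies; it is not the asker's actual splitting logic. Each function opens and closes its own files, and only the path travels between them.

```python
import os
import tempfile


def split_fasta(genome_path, split_path, piece_len=4):
    """Split a single-record FASTA into fixed-size pieces; return the output PATH."""
    with open(genome_path) as src:
        # Skip the header line and join the remaining sequence lines.
        seq = "".join(line.strip() for line in src if not line.startswith(">"))
    with open(split_path, "w") as out:
        for n, start in enumerate(range(0, len(seq), piece_len), 1):
            out.write(">split_%d\n%s\n" % (n, seq[start:start + piece_len]))
    # The file is closed (and flushed) by now; hand back the path, not the handle.
    return split_path


def write_contig_lengths(fasta_path, lengths_path):
    """Read a multi-FASTA by path and write 'id<TAB>length' rows."""
    with open(fasta_path) as src, open(lengths_path, "w") as out:
        out.write("contig_id\tcontig_length\n")
        name, length = None, 0
        for line in src:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    out.write("%s\t%d\n" % (name, length))
                name, length = line[1:], 0
            else:
                length += len(line)
        if name is not None:
            out.write("%s\t%d\n" % (name, length))


def demo():
    """Run both steps on a tiny made-up genome and return the lengths file text."""
    workdir = tempfile.mkdtemp()
    genome = os.path.join(workdir, "genome.fa")
    with open(genome, "w") as fh:
        fh.write(">genome\nATCGATCG\nAT\n")
    split_path = split_fasta(genome, os.path.join(workdir, "split.fa"))
    lengths = os.path.join(workdir, "lengths.tsv")
    write_contig_lengths(split_path, lengths)
    with open(lengths) as fh:
        return fh.read()
```

Because the first function finishes writing before it returns, the second one always sees a complete file, and it can open that file in plain read mode rather than an append mode.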
Upvotes: 2
Reputation: 256
The open function takes a filename and returns a file object. metagenome_simulator returns a file object. You pass this as fasta and then use open on it. But you do not need to open it, since it's already an open file and not just a filename.
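A small sketch of that idea, assuming Python 3 and using io.StringIO as a stand-in for a real file opened in a read/write mode. The function names write_pieces, count_records, reopen_fails and demo are made up for illustration. The consumer uses the handle it was given directly; the one extra step is rewinding, because the earlier writes leave the position at the end of the file.

```python
import io


def write_pieces(handle):
    """Write two FASTA records to an already-open handle and return it."""
    handle.write(">split_1\nATCG\n>split_2\nATCGAT\n")
    return handle


def count_records(handle):
    """Consume an already-open handle directly; no second open() call."""
    handle.seek(0)  # rewind, because the writes left the position at the end
    return sum(1 for line in handle if line.startswith(">"))


def reopen_fails(handle):
    """Show that calling open() on a file object raises TypeError."""
    try:
        open(handle)
    except TypeError:
        return True
    return False


def demo():
    # io.StringIO stands in for a real file opened with a mode such as "w+".
    with io.StringIO() as handle:
        return count_records(write_pieces(handle)), reopen_fails(handle)
```

With a real file on disk the same pattern applies, though the writer should flush (or close and let the reader reopen by path) so that buffered data is actually visible to the reader.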
Upvotes: 0