Brandon Kieft

Reputation: 103

Python, How can I correctly pass the result of one function (specifically, a file) to another function?

I'm a beginner. I have written a Python program with the following pseudocode:

  1. Define Function1.
     a. This function takes a large single-fasta file (a genome) and splits it into pieces.
     b. These pieces are written to a multi-fasta output file (ex. below).

  2. Define Function2.
     a. This function reads the lines of the multi-fasta file.
     b. It writes to an output file the fasta id followed by the length of the fasta entry.

Most of the code:

from Bio import SeqIO
import io
import sys

def metagenome_simulator(genome_fasta, out_file):
    outfile = open(out_file, "a+b")
    fasta = SeqIO.parse(open(genome_fasta, "rU"), "fasta")
         #does the split, blah, blah - I know this function works on its own already
    len_file.close()
    fasta.close()
    return outfile

def contig_len_calculator(fasta, out_file):
    outfile = io.open(out_file, "wb")
    fhandle = io.open(fasta, "a+b")
    outfile.write("contig_id" + "\t" + "contig_length" + "\n")
    for entry in SeqIO.parse(fhandle, "fasta"):
        #calculates lengths, blah, blah - i know this works independently too
    outfile.close()
    fhandle.close()
    return

def main():
    output = metagenome_simulator(sys.argv[1], sys.argv[2])
    print(output)
    contig_len_calculator(output, sys.argv[3])

main()

And my command (bash shell) would be:

./this_script.py genome_fasta_file split_fasta_out_file final_output_file

The output would be two separate files, one for each function in the program. The first would be the split fasta:

>split_1
ATCG....
>split_2
ATCG....
.
.
.

And the second would be the lengths file:

>split_1    300
>split_2    550
.
.
.

This does not work. It runs Function1 just fine and makes the split_fasta_output file, but then returns:

<open file 'out_file', mode 'a+b' at 0x7f54b8454d20>
Traceback (most recent call last):
  File "./this_script.py", line 62, in <module>
    main()
  File "./this_script.py", line 60, in main
    contig_len_calculator(output, sys.argv[3])
  File "./this_script.py", line 47, in contig_len_calculator
    fhandle = io.open(fasta, "a+b")
TypeError: invalid file: <open file 'out_file', mode 'a+b' at 0x7f54b8454d20>

I have no idea why it doesn't work. So my question is this: how do I properly pass a file created in one function to another function?

EDIT: Put the whole traceback error.

Upvotes: 2

Views: 152

Answers (2)

Adam Smith

Reputation: 54213

The problem is that metagenome_simulator returns an open file object, which you then try to pass into io.open. io.open takes either a path or an integer file descriptor (some_file.fileno()), not a file object, hence the TypeError. The simple solution is then to return the path to your outfile, rather than the outfile itself.

def metagenome_simulator(genome_fasta, out_file):
    ...  # your code as-written
    return out_file

But if you like you could instead do:

def metagenome_simulator(genome_fasta, out_file):
    # completely as-written, including
    return outfile

def contig_len_calculator(fasta, out_file):
    outfile = io.open(out_file, "wb")
    fhandle = io.open(fasta.fileno(), "a+b")
    ...

The advantage of the first approach is that it makes the out_file and fasta arguments to contig_len_calculator have the same type, which seems sane.
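
To make the first approach concrete, here is a minimal Python 3 sketch of the "return the path" pattern. It is not the asker's code: split_fasta and contig_lengths are hypothetical stand-ins for metagenome_simulator and contig_len_calculator, the piece_len parameter is invented for illustration, and plain string handling replaces Biopython so the example runs on its own. Each function opens and closes its own files with a with block, and only the path travels between them.

```python
def split_fasta(genome_path, out_path, piece_len=4):
    """Hypothetical stand-in for metagenome_simulator: returns a path, not a handle."""
    with open(genome_path) as src:
        # Join all sequence lines, skipping the ">" header line.
        sequence = "".join(line.strip() for line in src if not line.startswith(">"))
    with open(out_path, "w") as out:
        for number, start in enumerate(range(0, len(sequence), piece_len), 1):
            out.write(">split_%d\n%s\n" % (number, sequence[start:start + piece_len]))
    # The file is already closed here; the caller gets its path.
    return out_path


def contig_lengths(fasta_path, out_path):
    """Hypothetical stand-in for contig_len_calculator: both arguments are paths."""
    lengths = {}
    current = None
    with open(fasta_path) as src:
        for line in src:
            line = line.strip()
            if line.startswith(">"):
                current = line[1:]
                lengths[current] = 0
            elif current is not None:
                lengths[current] += len(line)
    with open(out_path, "w") as out:
        out.write("contig_id\tcontig_length\n")
        for name, length in lengths.items():
            out.write("%s\t%d\n" % (name, length))
    return lengths
```

In main() this would be called as split_path = split_fasta(sys.argv[1], sys.argv[2]) followed by contig_lengths(split_path, sys.argv[3]), so nothing is left open between the two steps.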

Upvotes: 2

Ariakenom

Reputation: 256

The open function takes a filename and returns a file object. metagenome_simulator returns a file object. You pass this as fasta and then use open on it. But you do not need to open it since it's already an open file and not just a filename.
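
A minimal Python 3 sketch of that idea, passing the open file object along instead of reopening it. The names write_pieces and count_records are hypothetical, and Biopython is left out so the example is self-contained. One detail worth noting: after writing, the file position sits at the end of the file, so the reader has to seek(0) before it can see anything.

```python
def write_pieces(out_path):
    """Hypothetical writer: returns the still-open file object."""
    handle = open(out_path, "w+")  # "w+" allows reading back what was written
    handle.write(">split_1\nATCG\n>split_2\nAT\n")
    return handle


def count_records(handle):
    """Hypothetical reader: takes an open file object, not a filename."""
    handle.seek(0)  # rewind; the position is at the end after writing
    return sum(1 for line in handle if line.startswith(">"))
```

With this design the caller owns the handle and is responsible for closing it once both steps are done, for example in a try/finally around the two calls.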

Upvotes: 0
