Altheman

Reputation: 83

Convert a FASTA file to a tab-delimited file using a Python script

I am a student currently learning how to write scripts in Python. I have been given the following exercise: I have to convert a FASTA file in the following format:

>header 1 
AATCTGTGTGATAT 
ATATA  
AT
>header 2  
AATCCTCT

into this:

>header 1  AATCTGTGTGATATATATAAT
>header 2  AATCCTCT 

I am having some difficulty getting rid of the whitespace (should I be using line.strip()?). Any help would be very much appreciated.

Upvotes: 2

Views: 1487

Answers (1)

James

Reputation: 36623

This starts a new string at each line that begins with >, and concatenates the following stripped lines onto it until the next >. Each completed string is then appended to a running list.

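# Open the FASTA file and build one tab-delimited line per record
out_lines = []
temp_line = ''
with open('path/to/file', 'r') as fp:
    for line in fp:
        if line.startswith('>'):
            # A new header starts: save the previous record, if there is one
            if temp_line:
                out_lines.append(temp_line)
            temp_line = line.strip() + '\t'
        else:
            # Sequence line: strip the whitespace and join it onto the record
            temp_line += line.strip()

# Save the last record, which has no following header to trigger the append
if temp_line:
    out_lines.append(temp_line)

with open('path/to/new_file', 'w') as fp_out:
    fp_out.write('\n'.join(out_lines))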
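
If you want to check the logic without touching any files, the same idea can be sketched as a small generator that takes any iterable of lines. The function name fasta_to_tab_lines and the example list are my own, made up for illustration. The sketch assumes every sequence line belongs to the most recent > header, and that blank lines can be ignored.

```python
def fasta_to_tab_lines(lines):
    """Yield one 'header<TAB>sequence' string per FASTA record.

    Hypothetical helper for illustration; `lines` is any iterable of
    strings, such as an open file object or a list.
    """
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()  # removes the trailing newline and stray spaces
        if not line:
            continue  # assumption: blank lines carry no data
        if line.startswith('>'):
            # A new record begins, so emit the previous one first
            if header is not None:
                yield header + '\t' + ''.join(chunks)
            header = line
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Emit the final record, which is not followed by another header
    if header is not None:
        yield header + '\t' + ''.join(chunks)


# The input from the question, with its trailing spaces kept
example = [
    ">header 1 \n",
    "AATCTGTGTGATAT \n",
    "ATATA  \n",
    "AT\n",
    ">header 2  \n",
    "AATCCTCT\n",
]
result = list(fasta_to_tab_lines(example))
print('\n'.join(result))
```

Because an open file is itself an iterable of lines, the same function should also work as fasta_to_tab_lines(fp) inside a with open(...) block.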

Upvotes: 2
