Sort the order of fasta sequence using python

Question

I have a fasta file (consists of >header and sequence lines) as below:

myfasta

>S.sclerotiorum_Ch16_153_209
AACCCTAACCCTAACCCTTGATTGATTGATTGATTGATTGAT
TGATTGATGAAATTATAGTCTCCGTAAAGCAAATAAAGCATT
TAGTAAACGTTGAAGAGCTAGAAAAGCTTTAATACAAAAAGG
>S.sclerotiorum_Ch16_153_209
AACCCTAACCCTAACCCTTGATTGATTGATTGATTGATTGAT
TAGTAAACGTTGAAGAGCTAGAAAAGCTTTAATACAAAAAGG
>S.sclerotiorum_Ch14_442_1137
TGTCAATTCGATCTAGTATT
>S.sclerotiorum_Ch12_1831_180
AGAGCTAGAAAAGCTTTAAT
>S.sclerotiorum_Ch1_1831_180
AGAGCTAGAAAAGCTTTAATAGAGCTAGAAAAGCTTTAAT
AGAGCTAGAAAAGCTTTAATAGAGCTAGAAAAGCTTTAAT

I want to print this file in a defined order (starting with Ch1 to Ch16) and get the result as below:

>S.sclerotiorum_Ch1_1831_180
AGAGCTAGAAAAGCTTTAATAGAGCTAGAAAAGCTTTAAT
AGAGCTAGAAAAGCTTTAATAGAGCTAGAAAAGCTTTAAT
>S.sclerotiorum_Ch12_1831_180
AGAGCTAGAAAAGCTTTAAT
>S.sclerotiorum_Ch14_442_1137
TGTCAATTCGATCTAGTATT
>S.sclerotiorum_Ch16_153_209
AACCCTAACCCTAACCCTTGATTGATTGATTGATTGATTGAT
TAGTAAACGTTGAAGAGCTAGAAAAGCTTTAATACAAAAAGG
>S.sclerotiorum_Ch16_153_209
AACCCTAACCCTAACCCTTGATTGATTGATTGATTGATTGAT
TGATTGATGAAATTATAGTCTCCGTAAAGCAAATAAAGCATT
TAGTAAACGTTGAAGAGCTAGAAAAGCTTTAATACAAAAAGG

I tried to write this code, but I am still getting the same order as my input file in my result.fasta. Any help would be appreciated in fixing my code. Thanks!

Code: python code.py myfasta.fasta >> result.fasta

#!/usr/bin/env python
import sys
import os
import pathlib

myfasta = sys.argv[1]
fasta = open(myfasta)

#types = ['S.sclerotiorum_Ch16_', 'S.sclerotiorum_Ch15_', 'S.sclerotiorum_Ch14_', 'S.sclerotiorum_Ch13_', 'S.sclerotiorum_Ch12_', 'S.sclerotiorum_Ch11_', 'S.sclerotiorum_Ch10_', 'S.sclerotiorum_Ch9_', 'S.sclerotiorum_Ch8_', 'S.sclerotiorum_Ch7_', 'S.sclerotiorum_Ch6_', 'S.sclerotiorum_Ch5_', 'S.sclerotiorum_Ch4_', 'S.sclerotiorum_Ch3_', 'S.sclerotiorum_Ch2_', 'S.sclerotiorum_Ch1_']

types = ['S.sclerotiorum_Ch1', 'S.sclerotiorum_Ch2', 'S.sclerotiorum_Ch3', 'S.sclerotiorum_Ch4', 'S.sclerotiorum_Ch5', 'S.sclerotiorum_Ch6', 'S.sclerotiorum_Ch7', 'S.sclerotiorum_Ch8', 'S.sclerotiorum_Ch9', 'S.sclerotiorum_Ch10', 'S.sclerotiorum_Ch11', 'S.sclerotiorum_Ch12', 'S.sclerotiorum_Ch13', 'S.sclerotiorum_Ch14', 'S.sclerotiorum_Ch15', 'S.sclerotiorum_Ch16']




for type in range(len(types)):
    flag = False
    fasta = open(myfasta)
    for line in fasta:
        if line.startswith('>') and types[type] in line:
            flag = True
        elif line.startswith('>'):
          flag = False
        if flag:
            #grabbed = line.strip()
            #newfasta.writelines(grabbed + "
")
            print(line.strip())

fasta.close

Sort the order of fasta sequence using python

Answers (1)

Related Questions