Reputation: 83
I am a student currently learning how to write scripts in Python. I have been given the following exercise. I have to convert a FASTA file in the following format:
>header 1
AATCTGTGTGATAT
ATATA
AT
>header 2
AATCCTCT
into this:
>header 1 AATCTGTGTGATATATATAAT
>header 2 AATCCTCT
I am having some difficulty getting rid of the whitespace between the lines (using line.strip()?). Any help would be very much appreciated.
Upvotes: 2
Views: 1487
Reputation: 36623
This starts a new string at each line that begins with > and keeps appending the following sequence lines to it until the next > is reached. Each completed string is then appended to a running list.
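# Open the file and iterate through the lines, composing each single line as we go
out_lines = []
temp_line = ''
with open('path/to/file', 'r') as fp:
    for line in fp:
        if line.startswith('>'):
            # A new header begins: store the previous record, if there is one
            if temp_line:
                out_lines.append(temp_line)
            # Start the next record with its header, followed by a tab
            temp_line = line.strip() + '\t'
        else:
            # Sequence line: strip the newline and join it to the record
            temp_line += line.strip()
# Store the last record, which has no following header to trigger the append
if temp_line:
    out_lines.append(temp_line)
with open('path/to/new_file', 'w') as fp_out:
    fp_out.write('\n'.join(out_lines))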
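To check the idea without touching any files, the same loop can be wrapped in a small function and run on the sample from the question. This is only a sketch: the function name join_fasta_records and its sep parameter are made up for illustration, and it assumes every record starts with a > header line.

```python
def join_fasta_records(lines, sep='\t'):
    """Collapse each multi-line FASTA record into one 'header<sep>sequence' string.

    Hypothetical helper for illustration; `lines` is any iterable of strings.
    """
    out_lines = []
    current = None  # no record has been started yet
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith('>'):
            # New header: store the previous record before starting the next one
            if current is not None:
                out_lines.append(current)
            current = line + sep
        elif current is not None:
            # Sequence line: join it to the record being built
            current += line
    # Store the final record, which has no following header
    if current is not None:
        out_lines.append(current)
    return out_lines


sample = """>header 1
AATCTGTGTGATAT
ATATA
AT
>header 2
AATCCTCT
"""

# Use a space as the separator to match the output shown in the question
result = join_fasta_records(sample.splitlines(), sep=' ')
print('\n'.join(result))
```

Because the function accepts any iterable of lines, an open file object can be passed in directly instead of sample.splitlines().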
Upvotes: 2