Reputation: 482
This code
for record in SeqIO.parse(open(file, 'rU'), 'fasta', generic_protein):
    record_id = re.sub(r'\d+_(\d+_\d\#\d+)_\d+', r'\1', record.id)
    if ( not reference_sequence ):
        reference_sequence = record.seq
        reference_name = record_id
        #continue
    print ",".join([reference_name, record_id, compare_seqs(reference_sequence, record.seq)])
gives terminal output that looks like
7065_8#1,8987_2#53,
7065_8#1,8987_2#58,
7065_8#1,8987_2#61,
7065_8#1,8987_2#62,E-G [246]
7065_8#1,8987_2#65,N-K [71],Y-D [223]
I want to write this output line by line to a CSV file. Any suggestions?
Upvotes: 0
Views: 79
Reputation: 518
You can also build the comma-separated string yourself (wrapping each field in a quote character) and write it directly to the file:
f = open("output.csv","w")
for record in SeqIO.parse(open(file, 'rU'), 'fasta', generic_protein):
    record_id = re.sub(r'\d+_(\d+_\d\#\d+)_\d+', r'\1', record.id)
    if ( not reference_sequence ):
        reference_sequence = record.seq
        reference_name = record_id
        #continue
    csvrow = '","'.join([reference_name, record_id, compare_seqs(reference_sequence, record.seq)])
    csvrow = '"'+csvrow+'"'
    print >>f, csvrow
f.close()
With this method you can open the file and check what has been written while the script is still running, although buffered rows may not show up until the buffer is flushed or the file is closed.
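To illustrate that point, here is a minimal sketch, assuming Python 3 (the code above uses the Python 2 print statement). The helper name quote_row, the temporary path and the sample rows are made up for the example; the rows are modelled on the output in the question. Flushing after each line is what lets a second reader see the rows before the file is closed.

```python
import os
import tempfile


def quote_row(fields):
    """Join fields into one CSV line with every field wrapped in double quotes."""
    return '"' + '","'.join(fields) + '"'


# Hypothetical sample rows, modelled on the output shown in the question.
rows = [
    ["7065_8#1", "8987_2#62", "E-G [246]"],
    ["7065_8#1", "8987_2#65", "N-K [71],Y-D [223]"],
]

path = os.path.join(tempfile.mkdtemp(), "output.csv")
lines_seen_while_open = []

f = open(path, "w")
try:
    for fields in rows:
        f.write(quote_row(fields) + "\n")
        f.flush()  # push the line out of Python's buffer
        # A second handle can already read what has been flushed so far.
        with open(path) as reader:
            lines_seen_while_open.append(len(reader.read().splitlines()))
finally:
    f.close()
```

Note that quoting by hand like this does not escape double quotes that occur inside a field; the csv module handles that case by doubling them.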
Upvotes: 1
Reputation: 36239
Pack all records in a nested list (i.e. instead of print ','.join(...) you do records.append([...])) and then you can use writerows(records) to write them to the file. No need to do something like ','.join(); that's what csv does for you.
For the sake of completeness:
import csv

records = []
for record in SeqIO.parse(open(file, 'rU'), 'fasta', generic_protein):
    record_id = re.sub(r'\d+_(\d+_\d\#\d+)_\d+', r'\1', record.id)
    if ( not reference_sequence ):
        reference_sequence = record.seq
        reference_name = record_id
        #continue
    records.append([reference_name, record_id, compare_seqs(reference_sequence, record.seq)])
# csv.writer is not a context manager, so use `with` on the file and wrap the handle
with open('file.csv', 'w') as fp:
    # note that it's not writerow but writerows, which lets you write multiple rows
    csv.writer(fp).writerows(records)
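For anyone who wants to try writerows without the FASTA input, here is a self-contained sketch, assuming Python 3. The records are made-up sample values modelled on the output in the question, and an in-memory io.StringIO buffer stands in for the real file.

```python
import csv
import io

# Hypothetical sample records, modelled on the output shown in the question.
records = [
    ["7065_8#1", "8987_2#61", ""],
    ["7065_8#1", "8987_2#62", "E-G [246]"],
    ["7065_8#1", "8987_2#65", "N-K [71],Y-D [223]"],
]

buffer = io.StringIO()  # stands in for an open file
writer = csv.writer(buffer, lineterminator="\n")
writer.writerows(records)  # one call writes every row
csv_text = buffer.getvalue()

# Reading the text back gives the original three-column rows.
round_trip = list(csv.reader(io.StringIO(csv_text)))
```

The third field of the last record contains a comma, so csv wraps that field in quotes and it stays a single column when the file is read back.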
Upvotes: 1
Reputation: 107287
You can use writerow to save your output as follows:
import csv

# open the file once, before the loop, so earlier rows are not overwritten
with open(csvfile, "w") as output:
    writer = csv.writer(output, lineterminator='\n')
    for record in SeqIO.parse(open(file, 'rU'), 'fasta', generic_protein):
        record_id = re.sub(r'\d+_(\d+_\d\#\d+)_\d+', r'\1', record.id)
        if ( not reference_sequence ):
            reference_sequence = record.seq
            reference_name = record_id
            #continue
        # pass the fields as a list; writerow inserts the commas itself
        writer.writerow([reference_name, record_id, compare_seqs(reference_sequence, record.seq)])
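A small sketch of how writerow treats its argument, assuming Python 3. The helper to_csv_line and the sample values are made up for illustration. Each element of the list becomes one column, so a pre-joined string passed in a one-element list ends up as a single quoted column.

```python
import csv
import io

# Hypothetical sample fields, modelled on the output shown in the question.
fields = ["7065_8#1", "8987_2#62", "E-G [246]"]


def to_csv_line(row):
    """Return the text that csv.writer produces for a single row."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(row)
    return buffer.getvalue()


joined_first = to_csv_line([",".join(fields)])  # one field that contains commas
as_list = to_csv_line(fields)                   # three separate fields
```

Here joined_first comes out as one column wrapped in quotes, while as_list gives three plain columns, which is usually what you want in the CSV.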
Upvotes: 1