user3234810

Reputation: 482

Writing list to CSV

This code

for record in SeqIO.parse(open(file, 'rU'), 'fasta', generic_protein):
    record_id = re.sub(r'\d+_(\d+_\d\#\d+)_\d+', r'\1', record.id)

    if ( not reference_sequence ):
      reference_sequence = record.seq
      reference_name     = record_id
      #continue
    print ",".join([reference_name, record_id, compare_seqs(reference_sequence, record.seq)])

gives terminal output that looks like

7065_8#1,8987_2#53,
7065_8#1,8987_2#58,
7065_8#1,8987_2#61,
7065_8#1,8987_2#62,E-G [246]
7065_8#1,8987_2#65,N-K [71],Y-D [223]

I want to write this output line by line to a CSV file. Any suggestions?

Upvotes: 0

Views: 79

Answers (3)

Anant Gupta

Reputation: 518

You can also write the comma-separated string (with quote characters added) directly to the file:

f = open("output.csv","w")
for record in SeqIO.parse(open(file, 'rU'), 'fasta', generic_protein):
  record_id = re.sub(r'\d+_(\d+_\d\#\d+)_\d+', r'\1', record.id)

  if ( not reference_sequence ):
    reference_sequence = record.seq
    reference_name     = record_id
    #continue
  csvrow =  '","'.join([reference_name, record_id, compare_seqs(reference_sequence, record.seq)])
  csvrow = '"'+csvrow+'"'
  print >>f, csvrow
f.close()

Using this method, you can open the file and check whether data is being written while the script is still running. Output is buffered, so call f.flush() after each write if you want to see rows immediately.
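A minimal Python 3 sketch of the same idea using the csv module instead of manual quoting, assuming each row is a list of strings. The function name `write_quoted_rows` and the sample rows are hypothetical, chosen only for illustration. `csv.QUOTE_ALL` quotes every field and doubles any embedded quote characters, which the manual join does not do, and the explicit `flush()` is what lets you see rows while the script is still running.

```python
import csv
import io

def write_quoted_rows(rows, fileobj):
    """Write each row with every field quoted, flushing after each row."""
    writer = csv.writer(fileobj, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for row in rows:
        writer.writerow(row)
        fileobj.flush()  # push buffered data out so partial results are visible

# Sample rows modelled on the question's output (illustrative only).
sample_rows = [
    ["7065_8#1", "8987_2#53", ""],
    ["7065_8#1", "8987_2#62", "E-G [246]"],
]

# An in-memory buffer stands in for the real output file here.
buffer = io.StringIO()
write_quoted_rows(sample_rows, buffer)
print(buffer.getvalue())
```

With a real file, pass the object returned by `open("output.csv", "w", newline="")` in place of the buffer.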

Upvotes: 1

a_guest

Reputation: 36239

Pack all records into a nested list (i.e. instead of print ','.join(...) you do records.append([...])), then use writerows(records) to write them to the file. There is no need for something like ','.join(); that is what csv does for you.

For the sake of completeness:

records = []
for record in SeqIO.parse(open(file, 'rU'), 'fasta', generic_protein):
    record_id = re.sub(r'\d+_(\d+_\d\#\d+)_\d+', r'\1', record.id)

    if ( not reference_sequence ):
      reference_sequence = record.seq
      reference_name     = record_id
      #continue
    records.append([reference_name, record_id, compare_seqs(reference_sequence, record.seq)])

with open('file.csv', 'w') as fp:
    # csv.writer is not a context manager, so the with statement wraps the file object
    csv.writer(fp).writerows(records)  # note that it's writerows, not writerow, which writes multiple rows at once
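As a self-contained Python 3 sketch of the `writerows` approach, the snippet below writes two sample records and reads them back. The records are copied from the question's output, the temporary output path is an assumption for illustration, and it assumes the third value is a single string. Note that a field containing commas is quoted automatically, so it stays in one column.

```python
import csv
import os
import tempfile

# Sample records modelled on the question's output (illustrative only).
records = [
    ["7065_8#1", "8987_2#62", "E-G [246]"],
    ["7065_8#1", "8987_2#65", "N-K [71],Y-D [223]"],
]

# Hypothetical output location inside a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "file.csv")

# In Python 3 the csv docs recommend newline="" when opening the file.
with open(out_path, "w", newline="") as fp:
    csv.writer(fp).writerows(records)

# Read the file back, both raw and parsed, to check the round trip.
with open(out_path, newline="") as fp:
    raw_text = fp.read()
with open(out_path, newline="") as fp:
    rows_read = list(csv.reader(fp))

print(raw_text)
```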

Upvotes: 1

Kasravnd

Reputation: 107287

You can use writerow to save your output as follows:

import csv

with open(csvfile, "w") as output:  # open once, before the loop, so rows are not overwritten
    writer = csv.writer(output, lineterminator='\n')
    for record in SeqIO.parse(open(file, 'rU'), 'fasta', generic_protein):
        record_id = re.sub(r'\d+_(\d+_\d\#\d+)_\d+', r'\1', record.id)

        if not reference_sequence:
            reference_sequence = record.seq
            reference_name     = record_id
            #continue
        # pass a list so that each value becomes its own CSV column
        writer.writerow([reference_name, record_id, compare_seqs(reference_sequence, record.seq)])
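To illustrate how `writerow` treats its argument, here is a small Python 3 sketch. The helper `to_csv_line` and the sample values are hypothetical. A pre-joined string wrapped in a list is written as one quoted field, whereas a list of values is written as separate columns.

```python
import csv
import io

# Sample values modelled on the question's output (illustrative only).
fields = ["7065_8#1", "8987_2#62", "E-G [246]"]

def to_csv_line(row):
    """Return the text that csv.writer would emit for a single row."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(row)
    return buf.getvalue()

joined = to_csv_line([",".join(fields)])  # one field that happens to contain commas
separate = to_csv_line(fields)            # three separate fields

print(joined)    # the whole line is wrapped in quotes as a single column
print(separate)  # three unquoted, comma-separated columns
```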

Upvotes: 1
