Reputation: 47
I'm trying to parse DNA strings.
The input.txt contains:
>Rosalind_6404
CCTGCGGAAGATCGGCACTAGAATAGCCAGAACCGTTTCTCTGAGGCTTCCGGCCTTCCCTCCCACTAATAATTCTGAGG
>Rosalind_5959
CCATCGGTAGCGCATCCTTAGTCCAATTAAGTCCCTATCCAGGCGCTCCGCCGAAGGTCTATATCCATTTGTCAGCAGACACGC
>Rosalind_0808
CCACCCTCGTGGTATGGCTAGGCATTCAGGAACCGGAGAACGCTTCAGACCAGCCCGGACTGGGAACCTGCGGGCAGTAGGTGGAAT
The code is:
f = open('input.txt', 'r')
raw_samples = f.readlines()
f.close()
samples = {}
cur_key = ''
for elem in raw_samples:
    if elem[0] == '>':
        cur_key = elem[1:].rstrip()
        samples[cur_key] = ''
    else:
        samples[cur_key] = samples[cur_key] + elem.rstrip()
print(samples)
for p_id, s in samples.values():
    samples[s_id] = (s.count('G') + s.count('C'))*100
print(samples)
I keep getting the error:
File "C:/Python34/test.py", line 18, in <module>
    for p_id, s in samples.values():
ValueError: too many values to unpack (expected 2)
Upvotes: 0
Views: 302
Reputation: 452
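import csv

# Assumes every record sits on one line, separated by '>' (next(reader) reads only the first line).
with open("input.txt", newline="") as f:
    reader = csv.reader(f, delimiter=">", quotechar="'")
    # Keep the non-empty fields; each field is an ID followed directly by its sequence.
    dkeys = [item for item in next(reader) if item.strip()]
# Parenthesise the sum so both counts are scaled, matching the formula in the question.
dvalues = [(item.count('G') + item.count('C')) * 100 for item in dkeys]
print(dict(zip(dkeys, dvalues)))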
I hope it's useful. :D
Upvotes: 1
Reputation: 47
I was able to solve the problem by changing
for p_id, s in samples.values()
to
for p_id, s in samples.items()
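I also noticed that p_id and s_id were two different names; they were meant to be the same variable. The change works because samples.values() yields only the sequence strings, so Python tried to unpack the characters of each string into two names, whereas samples.items() yields (key, value) pairs.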
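For reference, here is a minimal sketch of the corrected loop that also reproduces the original error. The helper name gc_scores and the two sample records are made up for illustration, and the score is the (G + C) * 100 formula from the question as written, not a percentage.

```python
def gc_scores(samples):
    """Return {id: (G count + C count) * 100}, the formula used in the question."""
    # .items() yields (key, value) pairs, so two-name unpacking works.
    return {s_id: (seq.count('G') + seq.count('C')) * 100
            for s_id, seq in samples.items()}


# Hypothetical records, not the real input file.
samples = {'Rosalind_0001': 'GGCA', 'Rosalind_0002': 'ATAT'}

# .values() yields only the strings, so unpacking each one into two names fails.
try:
    for s_id, seq in samples.values():
        pass
except ValueError as exc:
    print('values():', exc)

print(gc_scores(samples))
```

Building a new dict also avoids overwriting the sequences in samples, so the original strings stay available if they are needed later.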
Upvotes: 1