Reputation: 213
I have a dictionary called self.__sequences that maps each ID to a DNA sequence ("ID: DNA sequence"). The following is part of that dictionary:
{'1111758': ('TTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAA\n', ''),
'1111762': ('AGAGTTTGATCCTGGCTCAGATTGA\n', ''),
'1111763': ('AGAGTTTGATCCTGGCCTT\n', '') }
I want to concatenate the values of the dictionary into one single string or sequence (no \n and no ""), that is, I want something like
"TTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAAAGAGTTTGATCCTGGCTCAGATTGAAGAGTTTGATCCTGGCCTT"
I wrote the following code, but it does not give what I want. I guess it is because each value has two elements (the DNA sequence and ""). I am struggling to improve my code. Can anyone help me make it work?
def sequence_statistics(self):
    total_len = self.__sequences.values()[0]
    for i in range(len(self.__sequences)):
        total_len += self.__sequences.values()[i]
    return total_len
Upvotes: 1
Views: 9987
Reputation: 17629
This will iterate over the sorted keys of your sequences, extract the first element of each tuple in the dict, and strip the whitespace. Mind that dicts are unordered in Python 2.7, which is why the keys are sorted first:
''.join(self.__sequences[k][0].strip() for k in sorted(self.__sequences))
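For anyone who wants to try this outside the class, here is a minimal, self-contained sketch of the sorted-key approach. The name `sequences` is just a stand-in for `self.__sequences`, and the data is the sample from the question:

```python
# Sample data copied from the question; `sequences` stands in for self.__sequences.
sequences = {
    '1111758': ('TTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAA\n', ''),
    '1111762': ('AGAGTTTGATCCTGGCTCAGATTGA\n', ''),
    '1111763': ('AGAGTTTGATCCTGGCCTT\n', ''),
}

# Walk the IDs in sorted order, take the first tuple element, and strip the newline.
combined = ''.join(sequences[k][0].strip() for k in sorted(sequences))
print(combined)
```

The keys here are strings, so `sorted` orders them lexicographically, which happens to match numeric order because all three IDs have the same length.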
Upvotes: 2
Reputation: 31260
This is a generator that yields the first element of each value, with the "\n" stripped off:
(value[0].strip() for value in self.__sequences.values())
Since you probably want them sorted by keys, it becomes slightly more complicated:
(value[0].strip() for key, value in sorted(self.__sequences.items()))
And to turn that into a single string, join the pieces with '' (the empty string) as the separator:
''.join(value[0].strip() for key, value in sorted(self.__sequences.items()))
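Here is a rough sketch of how that could look inside the method. The class name `SequenceStore` and its constructor are made up for illustration, and the `key=lambda kv: int(kv[0])` part is an extra assumption on my side: it only makes sense if every ID is a numeric string, and it matters when IDs have different lengths (as plain strings, '10' sorts before '9').

```python
class SequenceStore(object):  # hypothetical class name, not from the question
    def __init__(self, sequences):
        self.__sequences = sequences

    def sequence_statistics(self):
        # Sort by numeric ID (assumes all keys are digit strings),
        # keep only the sequence (first tuple element), and strip the newline.
        items = sorted(self.__sequences.items(), key=lambda kv: int(kv[0]))
        return ''.join(value[0].strip() for key, value in items)


store = SequenceStore({'10': ('CC\n', ''), '9': ('AA\n', '')})
result = store.sequence_statistics()
print(result)  # '9' is placed before '10' because of the numeric sort key
```

If the IDs all have the same length, as in the question, the plain `sorted(self.__sequences.items())` gives the same order and the `key` argument can be dropped.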
Upvotes: 1
Reputation: 16940
>>> d = {'1111758': ('TTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAA\n', ''),
... '1111762': ('AGAGTTTGATCCTGGCTCAGATTGA\n', ''),
... '1111763': ('AGAGTTTGATCCTGGCCTT\n', '') }
>>>
>>> lis = []
>>> for tup in d.values():
...     lis.append(tup[0].rstrip('\n'))
...
>>> ''.join(lis)
'AGAGTTTGATCCTGGCTCAGATTGAAGAGTTTGATCCTGGCCTTTTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAA'
>>>
Upvotes: 1
Reputation: 2882
Try this code instead:
return "".join(v[0].strip() for k, v in self.__sequences.items())
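Note that this version does not sort, so the result follows the dict's own iteration order: arbitrary on Python 2.7, insertion order on Python 3.7 and later. A small sketch of the difference, assuming Python 3.7+ and using `sequences` as a stand-in for `self.__sequences`:

```python
# Insert the IDs out of order on purpose.
sequences = {}
sequences['1111763'] = ('AGAGTTTGATCCTGGCCTT\n', '')
sequences['1111758'] = ('TTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAA\n', '')

# Without sorting: follows insertion order on Python 3.7+.
unsorted_result = "".join(v[0].strip() for k, v in sequences.items())

# With sorting: ordered by ID regardless of how the dict was built.
sorted_result = "".join(v[0].strip() for k, v in sorted(sequences.items()))

print(unsorted_result)
print(sorted_result)
```

If the order of the concatenated sequences matters, sorting the items first is the safer choice.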
Upvotes: 0