neymar

Reputation: 213

Concatenate the values of a dictionary into a single string or sequence

I have a dictionary called self.__sequences that maps each ID to a DNA sequence ("ID: DNA sequence"). The following is part of that dictionary:

{'1111758': ('TTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAA\n', ''), 
'1111762': ('AGAGTTTGATCCTGGCTCAGATTGA\n', ''),
'1111763': ('AGAGTTTGATCCTGGCCTT\n', '') }

I want to concatenate the values of the dictionary into one single string or sequence (without the \n and without the ""). That is, I want something like:

"TTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAAAGAGTTTGATCCTGGCTCAGATTGAAGAGTTTGATCCTGGCCTT"

I wrote the following code, but it does not give me what I want. I guess that is because each value has two elements (the DNA sequence and ""). I am struggling to improve my code. Can anyone help me make it work?

def sequence_statistics(self):
    total_len=self.__sequences.values()[0]
    for i in range(len(self.__sequences)):
        total_len += self.__sequences.values()[i]
    return total_len

Upvotes: 1

Views: 9987

Answers (4)

Matt

Reputation: 17629

This iterates over the sorted keys of your sequences, extracts the first element of each tuple in the dict, and strips the surrounding whitespace. Note that dicts are unordered in Python 2.7, which is why the keys are sorted:

''.join(self.__sequences[k][0].strip() for k in sorted(self.__sequences))

Upvotes: 2

RemcoGerlich

Reputation: 31260

This is a generator that yields the first element of each value, with the "\n" stripped off:

(value[0].strip() for value in self.__sequences.values())

Since you probably want them sorted by keys, it becomes slightly more complicated:

(value[0].strip() for key, value in sorted(self.__sequences.items()))

To turn that into a single string, join the pieces with '' (the empty string) as the separator:

''.join(value[0].strip() for key, value in sorted(self.__sequences.items()))
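The three steps above can be sketched on the sample data from the question. This is only an illustration: the plain dict `sequences` is a stand-in for self.__sequences, and the intermediate names `pairs`, `parts`, and `combined` are made up for this example.

```python
# Sample data copied from the question; `sequences` stands in for self.__sequences.
sequences = {
    '1111758': ('TTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAA\n', ''),
    '1111762': ('AGAGTTTGATCCTGGCTCAGATTGA\n', ''),
    '1111763': ('AGAGTTTGATCCTGGCCTT\n', ''),
}

# Step 1: sort the (key, value) pairs by key so the order is deterministic.
pairs = sorted(sequences.items())

# Step 2: keep only the first tuple element and strip the trailing newline.
parts = [value[0].strip() for key, value in pairs]

# Step 3: join the pieces with an empty separator.
combined = ''.join(parts)
print(combined)
```

Building the list in step 2 is only there to make the intermediate result visible; passing the generator expression straight to ''.join() as shown above does the same thing in one line.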

Upvotes: 1

James Sapam

Reputation: 16940

>>> d = {'1111758': ('TTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAA\n', ''),
... '1111762': ('AGAGTTTGATCCTGGCTCAGATTGA\n', ''),
... '1111763': ('AGAGTTTGATCCTGGCCTT\n', '') }
>>>
>>> lis = []
>>> for tup in d.values():
...     lis.append(tup[0].rstrip('\n'))
...
>>> ''.join(lis)
'AGAGTTTGATCCTGGCTCAGATTGAAGAGTTTGATCCTGGCCTTTTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAA'
>>>

Upvotes: 1

arocks

Reputation: 2882

Try this code instead:

return "".join(v[0].strip() for k, v in self.__sequences.items())
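For context, here is a rough sketch of how that line might sit inside the method. The class name `SequenceStore` and its constructor are hypothetical, since the question does not show the surrounding class. The pieces are not sorted here, so the result follows the dict's own order (insertion order on Python 3.7+, arbitrary on Python 2.7).

```python
class SequenceStore(object):  # hypothetical class name, not from the question
    def __init__(self, sequences):
        # Maps ID -> (DNA sequence with trailing newline, second field).
        self.__sequences = sequences

    def sequence_statistics(self):
        # Take element 0 of each tuple, strip the newline, and join everything.
        return "".join(v[0].strip() for k, v in self.__sequences.items())


store = SequenceStore({
    '1111762': ('AGAGTTTGATCCTGGCTCAGATTGA\n', ''),
    '1111763': ('AGAGTTTGATCCTGGCCTT\n', ''),
})
result = store.sequence_statistics()
print(result)
```

The loop in the question started from a whole tuple and added further tuples to it with +=, so it built up a tuple rather than a string. Picking out v[0] before joining avoids that.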

Upvotes: 0
