Peter Hanson

Reputation: 193

Quick basic loop

I have the following lines in a file. Here is an example of one line:

NM_???? chr12 - 10 110 10 110 3 10,50,100, 20,60,110,

I have the following code to get the info out:

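with open(infile, 'r') as fp:
    for line in fp:
        tokens = line.split()
        # Drop the trailing comma, then convert each field to int
        exonstarts = [int(s) for s in tokens[8][:-1].split(',')]
        exonends = [int(e) for e in tokens[9][:-1].split(',')]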

This gives me lists like these:

exonstarts = [10,50,100]
exonends = [20,60,110]

This line has 3 exons (although other lines in the file may have more or fewer than 3, so this must work for any number of exons), and they span:

 10-20
 50-60
 100-110

So for each number in the start list there is a matching one in the end list. This means that the first exon starts at exonstarts[0] and ends at exonends[0], the second starts at exonstarts[1] and ends at exonends[1], and so on.

How do I write the rest of this code so it pairs up the elements as such?


Update:

From this:

tokens = line.split()
exonstarts = [int(s) for s in tokens[8][:-1].split(',')]
exonends = [int(e) for e in tokens[9][:-1].split(',')]
zipped = list(zip(exonstarts, exonends))

I have another problem: I have a string that I want these pieces of. For example, I would want chr_string[10:20]+chr_string[50:60]+chr_string[100:110]. Is there an easy way to express this?

Upvotes: 2

Views: 140

Answers (3)

johnsyweb

Reputation: 141810

You can get these pairs using zip():

>>> for t in zip(exonstarts, exonends):
...     print('%d-%d' % t)
... 
10-20
50-60
100-110

To get a list by slicing chr_string (which I have fabricated) using these pairs:

>>> [chr_string[start:end] for start,end in zip(exonstarts, exonends)]
['0506070809', '2526272829', '5051525354']

To join these together:

>>> ''.join(chr_string[start:end] for start,end in zip(exonstarts, exonends))
'050607080925262728295051525354'
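
Note that the fields come back from split() as strings, so they have to be converted to int before they can be used with %d or as slice indices. Here is a minimal sketch putting the steps together, assuming the column layout of the sample line in the question. The helper names parse_exons and splice are hypothetical, and chr_string is fabricated as above:

```python
def parse_exons(line):
    """Return a list of (start, end) integer pairs from one line."""
    tokens = line.split()
    # Columns 8 and 9 hold comma-terminated lists such as "10,50,100,".
    starts = [int(s) for s in tokens[8].rstrip(',').split(',')]
    ends = [int(e) for e in tokens[9].rstrip(',').split(',')]
    return list(zip(starts, ends))


def splice(chr_string, exons):
    """Concatenate the slices of chr_string covered by each exon."""
    return ''.join(chr_string[start:end] for start, end in exons)


line = "NM_???? chr12 - 10 110 10 110 3 10,50,100, 20,60,110,"
exons = parse_exons(line)
# Fabricated sequence: the two-digit numbers 00..99 run together.
chr_string = ''.join('%02d' % i for i in range(100))
spliced = splice(chr_string, exons)
```

Because zip() pairs items positionally, this works for any number of exons per line.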

Upvotes: 0

garnertb

Reputation: 9584

The zip built-in is what you're looking for:

>>> exonstarts = [10,50,100]
>>> exonends = [20,60,110]
>>> list(zip(exonstarts,exonends))
[(10, 20), (50, 60), (100, 110)]

Upvotes: 3

Bill Lynch

Reputation: 81936

I believe you want the zip function.

In [1]: exonstarts = [10,50,100]

In [2]: exonends = [20,60,110]

In [3]: list(zip(exonstarts, exonends))
Out[3]: [(10, 20), (50, 60), (100, 110)]
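
One caveat: zip() stops silently at the end of the shorter input, so a malformed line with mismatched counts would drop exons without warning. A small sketch of a guard, where pair_exons is a hypothetical helper name and not part of the question's code:

```python
def pair_exons(exonstarts, exonends):
    """Pair starts with ends, refusing lists of different lengths."""
    if len(exonstarts) != len(exonends):
        raise ValueError(
            "got %d starts but %d ends" % (len(exonstarts), len(exonends))
        )
    return list(zip(exonstarts, exonends))


pairs = pair_exons([10, 50, 100], [20, 60, 110])
```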

Upvotes: 2
