Reputation: 399
I think the best way to explain my predicament is with an example. Say I have a file that contains this information:
isoform snp_rein
NM_005101 97
NM_005101 144
NM_198576 20790
and a dictionary that looks like so:
exons = {'NM_005101': [(0, 110), (517, 1073)], 'NM_198576': [(0, 251), (2078, 2340), (15154, 15202), (20542, 20758), (21050, 21275), (21355, 21580), (21833, 22040), (23116, 23335), (23415, 23610), (23700, 23901), (23986, 24135), (24211, 24317), (25038, 25155), (25236, 25401), (25610, 25754), (25841, 25966), (26037, 26143), (26274, 26613), (26697, 26835), (27204, 27332), (27450, 27565), (27653, 27773), (27889, 28243), (28744, 28937), (29113, 29329), (29443, 29673), (29780, 29915), (30110, 30207), (30304, 30469), (30603, 30715), (31130, 31247), (31330, 31523), (31605, 31693), (33630, 33855), (34325, 34429), (34701, 35997)]}
I've been working on some code that finds which pair of numbers the snp_rein number lies between. I was then able to calculate the difference between the max of the first pair and the min of the second pair, and so on. My code is below:
totalintron = 0
if name in exons:
    y = exons[name]
    for sd, i in enumerate(exons[name]):
        if snpos <= max(i):
            exonnumber = sd + 1
            position = sd
            print exonnumber
            break
    for index in range(len(y) - 1):
        first_max = max(y[index])
        second_min = min(y[index + 1])
        intron = second_min - first_max
        print intron
        totalintron = totalintron + intron
    print totalintron
    totalintron = 0
My output looks like this (**x** indicates the exonnumber, and the last number in each group is the total I want to change):
**1**
407
407
**2**
407
407
**5**
1827
12814
5340
292
80
253
1076
80
90
85
76
721
81
209
87
71
131
84
369
118
88
116
501
176
114
107
195
97
134
415
83
82
1937
470
272
28671
My problem lies with the total. I want to total only as many numbers as the exonnumber specifies. For the first output, I want the total to read 0 because the tested number was within the specified range for exon 1. For the second output, I want the total to read 407 because it was in exon 2. For the last output, I want to sum the first 4 numbers because the tested number was in exon 5.
This is what I want my output to look like:
**1**
0
**2**
407
**5**
20273
Any suggestions on how to change my totalling so that it sums only a specified number of values? Please explain what you suggest, because I'm new to Python.
Upvotes: 0
Views: 58
Reputation: 387667
You want that last inner loop to look like this:
# reset the `totalintron` for the current `exonnumber`
totalintron = 0
# only iterate `exonnumber - 1` (which is guaranteed to be len(y) - 1 at max)
for index in range(exonnumber - 1):
    first_max = max(y[index])
    second_min = min(y[index + 1])
    intron = second_min - first_max
    # don't print `intron`, we only care about the total
    totalintron = totalintron + intron
print totalintron
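For reference, the same idea can be sketched as a small self-contained function. This is only an illustration written in Python 3 syntax, not the asker's actual script: the function name `intron_total_before` is made up here, the `NM_198576` list is shortened to its first five exons to keep the example brief, and it assumes every tuple is ordered as `(start, end)`. It also returns `None` when the position lies past the last exon, a case the loop above does not handle.

```python
def intron_total_before(exon_ranges, snp_pos):
    """Return (exon_number, total_intron_length) for snp_pos.

    Hypothetical helper: assumes each tuple is (start, end) with start <= end.
    Returns None if snp_pos lies beyond the last exon.
    """
    for index, (start, end) in enumerate(exon_ranges):
        if snp_pos <= end:
            # Add up only the gaps between the exons that come before this one.
            total = sum(
                exon_ranges[i + 1][0] - exon_ranges[i][1]
                for i in range(index)
            )
            return index + 1, total
    return None


# Data from the question; NM_198576 is truncated to its first five exons.
exons = {
    "NM_005101": [(0, 110), (517, 1073)],
    "NM_198576": [(0, 251), (2078, 2340), (15154, 15202),
                  (20542, 20758), (21050, 21275)],
}

for name, snp_pos in [("NM_005101", 97), ("NM_005101", 144), ("NM_198576", 20790)]:
    print(name, snp_pos, intron_total_before(exons[name], snp_pos))
```

With the three rows from the question's file, this yields exon 1 with a total of 0, exon 2 with a total of 407, and exon 5 with a total of 20273, matching the desired output.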
Upvotes: 1