Finding nearest set of numbers given a position

Question

I have a dictionary that looks something like so:

exons = {'NM_015665': [(0, 225), (356, 441), (563, 645), (793, 861)], etc...}

and another file that has a position like so:

isoform    pos    
NM_015665    449

What I want to do is print the range that the position in the file is closest to, and then print the endpoint of that range that the position is closest to. In this case, I want to print (356, 441) and then 441. I've figured out how to print the closest endpoint, but my code below only considers positions within a fixed margin (5 in the code below) on either side of each range. Is there a way to account for the fact that the gaps between ranges vary in size?

This is the code I have so far:

import csv

with open('splicing_reinitialized.txt') as f:
    reader = csv.DictReader(f, delimiter="\t")
    for row in reader:
        pos = row['pos']
        name = row['isoform']
        ppos1 = int(pos)
        if name in exons:
            for low, high in exons[name]:
                # Only matches when the position is within 5 of the range
                if low - 5 <= ppos1 <= high + 5:
                    values = (low, high)
                    closest = min((low, high), key=lambda x: abs(x - ppos1))

Imanol Luengo · Accepted Answer

I would rewrite it as a minimum distance search:

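if name in exons:
    y = exons[name]
    minDist = float('inf')  # any real distance will be smaller
    minIdx = None
    minNum = None
    for i, (low, high) in enumerate(y):
        dlow = abs(low - ppos1)    # distance to the start of the range
        dhigh = abs(high - ppos1)  # distance to the end of the range
        dist = min(dlow, dhigh)
        if dist < minDist:
            minDist = dist
            minIdx = i
            minNum = 0 if dlow < dhigh else 1  # 0 = low end, 1 = high end
    if minIdx is not None:  # skip isoforms with an empty list of ranges
        print(y[minIdx])
        print(y[minIdx][minNum])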

This ignores the fixed search window and simply searches for the range whose nearest endpoint has the minimum distance to the position.
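
The same minimum-distance idea can also be sketched more compactly with the built-in min and a key function. This is only an illustration, not part of the original answer: the function name nearest_exon_boundary is hypothetical, and it assumes each range is a (low, high) tuple as in the question. On ties, min returns the first candidate it sees.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]


def nearest_exon_boundary(
    intervals: List[Interval], pos: int
) -> Optional[Tuple[Interval, int]]:
    """Return (closest range, closest endpoint of that range) for pos.

    Hypothetical helper for illustration; returns None for an empty list.
    """
    if not intervals:
        return None
    # Rank each range by the distance from pos to its nearer endpoint.
    best = min(intervals, key=lambda iv: min(abs(iv[0] - pos), abs(iv[1] - pos)))
    # Within the chosen range, pick whichever endpoint is nearer to pos.
    edge = min(best, key=lambda x: abs(x - pos))
    return best, edge


exons = {'NM_015665': [(0, 225), (356, 441), (563, 645), (793, 861)]}
interval, edge = nearest_exon_boundary(exons['NM_015665'], 449)
print(interval)  # (356, 441)
print(edge)      # 441
```

Because no window is involved, the result does not depend on how far apart the ranges are.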
