Finding Interfaces

Question

Suppose I have a dictionary that I created using defaultdict, like this one (many hundreds of points long):

L [(32.992, 22.861, 29.486, 'TYR'), (32.613, 26.653, 29.569, 'VAL'), (30.029, 28.873, 27.872, 'LEU'), ...]
A [(1.719, -25.217, 8.694, 'PRO'), (2.934, -21.997, 7.084, 'SER'), (5.35, -19.779, 8.986, 'VAL'), ...]
H [(-0.511, 19.577, 27.422, 'GLU'), (2.336, 18.416, 29.649, 'VAL'), (2.65, 19.35, 33.322, 'GLN'), ...]

I then want to loop over every value under each key and check the distance from that value to every residue under the other keys. I know how to compute the distance with a simple formula, but I am having trouble getting the loop to work properly. Ideally it would check each point in L against each point in A and H, then move on to checking each value in A against L and H, and so on.

I have tried simple comprehensions, but they only ever check one chain against one other chain before finishing. Any help is appreciated.

I am extracting coordinates from a PDB file like this:

xyz = []

for line in pdb_file:
    if line.startswith("ATOM"):
        # get x, y, z coordinates for CA (alpha-carbon) atoms
        chainid = line[21].strip()          # chain ID is column 22
        atomid = line[17:20].strip()        # residue name, columns 18-20
        pdbresn = int(line[22:26].strip())  # residue number, columns 23-26
        x = float(line[30:38].strip())
        y = float(line[38:46].strip())
        z = float(line[46:54].strip())
        if line[12:16].strip() == "CA":
            xyz.append((chainid,x,y,z,atomid))

and then putting it into a dictionary like this:

d = defaultdict(list)
for c,x,y,z,cid in xyz:
    d[c].append((x,y,z,cid))
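
The parsing and grouping steps above can be exercised without a real file. The following is only a sketch: `make_atom_line` and `group_ca_by_chain` are hypothetical helpers, the records are synthetic, and the slices assume the standard fixed-width ATOM layout (chain ID in column 22, residue number in columns 23-26).

```python
from collections import defaultdict

def make_atom_line(serial, name, resname, chain, resseq, x, y, z):
    """Build a synthetic fixed-width ATOM record (hypothetical test helper)."""
    return "ATOM  {:5d} {:<4s} {:>3s} {:1s}{:4d}    {:8.3f}{:8.3f}{:8.3f}".format(
        serial, name, resname, chain, resseq, x, y, z)

def group_ca_by_chain(lines):
    """Return {chain: [(x, y, z, resname), ...]} for CA atoms only."""
    d = defaultdict(list)
    for line in lines:
        # Skip everything except alpha-carbon ATOM records.
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        chain = line[21]
        resname = line[17:20].strip()
        x = float(line[30:38])
        y = float(line[38:46])
        z = float(line[46:54])
        d[chain].append((x, y, z, resname))
    return d

lines = [
    make_atom_line(1, " CA ", "TYR", "L", 1, 32.992, 22.861, 29.486),
    make_atom_line(2, " CB ", "TYR", "L", 1, 33.500, 21.900, 30.100),  # not a CA
    make_atom_line(3, " CA ", "PRO", "A", 1001, 1.719, -25.217, 8.694),
]
d = group_ca_by_chain(lines)
```

The third record uses a four-digit residue number to show that the chain ID is still read cleanly from a single column.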

giving this when I `print(d)`:

defaultdict(<class 'list'>, {'L': [(32.992, 22.861, 29.486, 'TYR'), (32.613, 26.653, 29.569, 'VAL'), (30.029, 28.873, 27.872, 'LEU')], 'H': [(30.254, 32.655, 27.849, 'THR'), (27.487, 35.089, 27.0, 'GLN'), (27.343, 38.9, 27.424, 'PRO')], 'A': [(25.621, 40.067, 30.641, 'PRO'), (23.161, 42.327, 28.82, 'SER'), (22.086, 43.358, 25.326, 'VAL'), (20.081, 46.519, 24.785, 'SER'), (18.23, 46.826, 21.488, 'VAL')]})

abarnert · Accepted Answer

I'm going to take a guess at what you want here.

You have a dictionary, something like this:

{'L': [(32.992, 22.861, 29.486, 'TYR'), (32.613, 26.653, 29.569, 'VAL'), (30.029, 28.873, 27.872, 'LEU')],
 'A': [(1.719, -25.217, 8.694, 'PRO'), (2.934, -21.997, 7.084, 'SER'), (5.35, -19.779, 8.986, 'VAL')],
 'H': [(-0.511, 19.577, 27.422, 'GLU'), (2.336, 18.416, 29.649, 'VAL'), (2.65, 19.35, 33.322, 'GLN')]}

What you want is something like this pseudocode:

for each list in the dictionary's values:
    for each of the other two lists:
        for each element in the first list:
            for each element in the other list:
                do something with the distance between the two elements.

In this case, "each of the other two lists" is pretty simple, because there are only 3… but in general, it's simpler to do this as:

for each pair of lists in the dictionary's values:
    for each pair of elements from the cartesian product of the two lists:
        do something with the distance between the two elements

And you can translate that directly to Python:

from itertools import permutations, product, chain

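# permutations yields every ordered pair of chains: (L, A), (A, L), ...
for lst1, lst2 in permutations(d.values(), 2):
    # product already yields (e1, e2) tuples, so it needs no flattening
    for e1, e2 in product(lst1, lst2):
        do_something_with(dist(e1, e2))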
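
To make that loop concrete, here is a minimal runnable sketch. The `dist` function is an assumption (plain Euclidean distance over the first three fields of each tuple), and the dictionary is a small stand-in for the real data.

```python
import math
from itertools import permutations, product

# Small stand-in for the real dictionary of chains.
d = {
    'L': [(32.992, 22.861, 29.486, 'TYR'), (32.613, 26.653, 29.569, 'VAL')],
    'A': [(1.719, -25.217, 8.694, 'PRO'), (2.934, -21.997, 7.084, 'SER')],
    'H': [(-0.511, 19.577, 27.422, 'GLU')],
}

def dist(e1, e2):
    """Euclidean distance between two (x, y, z, resname) tuples (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(e1[:3], e2[:3])))

distances = []
for lst1, lst2 in permutations(d.values(), 2):
    # Every element of the first chain against every element of the second.
    for e1, e2 in product(lst1, lst2):
        distances.append(dist(e1, e2))
```

Because `permutations` yields ordered pairs, each pair of chains is visited twice (L against A, then A against L), which matches what the question asks for.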

If you just want to collect those distances into a list with a comprehension, that's easy:

distances = [dist(e1, e2) for lst1, lst2 in permutations(d.values(), 2)
             for e1, e2 in product(lst1, lst2)]

However, I think it might be more readable like this:

list_pairs = permutations(d.values(), 2)
item_pairs = chain.from_iterable(product(lst1, lst2) for lst1, lst2 in list_pairs)
distances = [dist(e1, e2) for e1, e2 in item_pairs]
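
If the goal is to find which residues sit at an interface, a bare list of distances loses track of which chains and residues each distance came from. One possible variation, sketched below, iterates over `d.items()` to keep the chain labels and uses `combinations` instead of `permutations`, since distance is symmetric and each pair of chains only needs one visit. The `close_pairs` helper, the `dist` metric, and the 8.0 cutoff are all illustrative assumptions, not anything from the question.

```python
import math
from itertools import combinations, product

CUTOFF = 8.0  # arbitrary illustrative threshold, in the same units as the coordinates

def dist(e1, e2):
    """Euclidean distance between two (x, y, z, resname) tuples (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(e1[:3], e2[:3])))

def close_pairs(d, cutoff=CUTOFF):
    """Yield (chain1, res1, chain2, res2, distance) for cross-chain pairs within cutoff."""
    for (c1, lst1), (c2, lst2) in combinations(d.items(), 2):
        for e1, e2 in product(lst1, lst2):
            r = dist(e1, e2)
            if r <= cutoff:
                yield c1, e1[3], c2, e2[3], r

d = {
    'L': [(32.992, 22.861, 29.486, 'TYR'), (32.613, 26.653, 29.569, 'VAL'),
          (30.029, 28.873, 27.872, 'LEU')],
    'H': [(30.254, 32.655, 27.849, 'THR'), (27.487, 35.089, 27.0, 'GLN')],
    'A': [(25.621, 40.067, 30.641, 'PRO')],
}
contacts = list(close_pairs(d))
```

Each entry in `contacts` names both chains and both residues, so it can be filtered or grouped afterwards without recomputing anything.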
