Homap

Reputation: 2214

Pairwise comparison of elements in a list python

Given a list below:

snplist = [[1786, 0.0126525], [2463, 0.0126525], [2907, 0.0126525], [3068, 0.0126525], [3086, 0.0126525], [3398, 0.0126525], [5468,0.012654], [5531,0.0127005], [5564,0.0127005], [5580,0.0127005]]

I want to do a pairwise comparison of the second element of each sublist with that of the next sublist, i.e. check whether 0.0126525 from [1786, 0.0126525] is equal to 0.0126525 from [2463, 0.0126525], and so forth. If they are equal, print the output as indicated in the code.

Using a for loop, I achieve the result:

for index, item in enumerate(snplist, 0):
    if index < len(snplist)-1:
        if snplist[index][1] == snplist[index+1][1]:
            print snplist[index][0], snplist[index+1][0], snplist[index][1]

When doing pairwise comparisons of the elements of a list using the list index, I always run into an 'index out of range' error because of the last element. I solve this problem by adding a condition:

if index < len(snplist)-1:

I don't think this is the best way of doing this. Are there more elegant ways of doing pairwise comparisons of list elements in Python?

EDIT: I had not thought about the level of tolerance when comparing floats. I would consider two floats that differ by no more than 0.001 as being equal.

Upvotes: 4

Views: 5790

Answers (2)

thefourtheye

Reputation: 239473

You can zip snplist with a copy of itself that excludes the first element, and compare the resulting pairs, like this:

for l1, l2 in zip(snplist, snplist[1:]):
    if l1[1] == l2[1]:
        print l1[0], l2[0], l1[1]

Since you are comparing floating point numbers, I would recommend using the math.isclose function, available since Python 3.5. On earlier versions you can define an equivalent function like this:

def isclose(a, b, rel_tol=1e-09, abs_tol=0.0):
    return abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

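Since you want a tolerance of 0.001 on the difference itself, pass it as the absolute tolerance (a third positional argument would be taken as rel_tol, which scales with the values being compared):

if isclose(l1[1], l2[1], abs_tol=0.001):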
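Putting the two pieces together, here is a minimal sketch of the whole loop, assuming Python 3.5+ (so print is a function and math.isclose is available). The helper name close_neighbors is made up for illustration, and rel_tol is set to 0.0 so that only the absolute tolerance applies.

```python
import math

snplist = [[1786, 0.0126525], [2463, 0.0126525], [2907, 0.0126525],
           [3068, 0.0126525], [3086, 0.0126525], [3398, 0.0126525],
           [5468, 0.012654], [5531, 0.0127005], [5564, 0.0127005],
           [5580, 0.0127005]]


def close_neighbors(rows, abs_tol=0.001):
    """Return (id1, id2, value) for adjacent rows whose values differ by at most abs_tol.

    close_neighbors is a hypothetical helper name, not part of any library.
    """
    matches = []
    # zip stops at the shorter input, so the last row is never indexed past the end.
    for left, right in zip(rows, rows[1:]):
        # rel_tol=0.0 disables the relative check; only abs_tol decides.
        if math.isclose(left[1], right[1], rel_tol=0.0, abs_tol=abs_tol):
            matches.append((left[0], right[0], left[1]))
    return matches


for first_id, second_id, value in close_neighbors(snplist):
    print(first_id, second_id, value)
```

Note that with a tolerance of 0.001 every adjacent pair in this snplist counts as equal, because the largest gap between neighbors is far smaller than 0.001. A tighter abs_tol is needed if the three groups of values should stay distinct.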

Upvotes: 10

timgeb

Reputation: 78690

I suggest that you use izip (from itertools in Python 2; in Python 3 the built-in zip is already lazy) to create a generator of item-neighbor pairs. Leaving the problem of comparing floating points aside, the code would look like this:

>>> from itertools import izip
>>> lst = [[1,2], [3,4], [5,4], [7,8], [9,10], [11, 10]]
>>> for item, next in izip(lst, lst[1:]):
...     if item[1] == next[1]:
...         print item[0], next[0], item[1]
... 
3 5 4
9 11 10

Remember to specify a tolerance when comparing floats; do not compare them with ==!

You could define an almost_equal function for this, for example:

def almost_equal(x, y, tolerance):
    return abs(x-y) < tolerance

Then in the code above, use almost_equal(item[1], next[1], tolerance) instead of the comparison with ==.
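As a rough sketch of how this fits together in Python 3, the pairs can also be produced without slicing the list. The pairwise helper below follows the well-known itertools recipe based on tee; it is a swapped-in alternative to izip(lst, lst[1:]), not the code from the answer above. The float values in lst are made up for illustration.

```python
from itertools import tee


def pairwise(iterable):
    """Return an iterator of (item, neighbor) pairs without copying the input."""
    first, second = tee(iterable)
    next(second, None)  # advance the second iterator by one position
    return zip(first, second)


def almost_equal(x, y, tolerance):
    return abs(x - y) < tolerance


# Illustrative data: ids paired with float values.
lst = [[1, 2.0], [3, 4.0], [5, 4.0004], [7, 8.0], [9, 10.0], [11, 10.0]]

matches = [(item[0], neighbor[0], item[1])
           for item, neighbor in pairwise(lst)
           if almost_equal(item[1], neighbor[1], 0.001)]
print(matches)
```

The loop variable is called neighbor rather than next here, so the built-in next stays available inside pairwise.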

Upvotes: 4
