sci-guy

Reputation: 2584

Checking to see if array elements are equal

In Python, how would I do this?

Say I have:

a = [[1, 5], [2, 6], [3, 3], [4, 2]]
b = [[3, 1], [4, 2], [1, 8], [2, 4]]

Now I want to do an operation with the second column values IF the first column values match.

E.g.

a has an entry [1,5]. Going through b, I find the entry [1,8] with the same first value, so I want to divide 5/8 and store that value in, say, an array c. Next would be matching [2,6] with [2,4], giving the next value in c: 6/4.

so:

c = [5/8, 6/4, 3/1, 2/2]

Given the above example, I hope this makes sense. I would like to do this with numpy and Python.

Upvotes: 2

Views: 198

Answers (4)

Bi Rico

Reputation: 25813

I humbly propose that you're using the wrong data structure. Notice that if you have an array column that has unique values between 1 and N (an index column), you could encode the same data simply by re-ordering your other columns. Once you've re-ordered your data, not only can you drop the "index" column, but it also becomes easier to operate on the remaining data. Let me demonstrate:

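import numpy as np

# Keys run from 1 to 4, so N = 5 leaves slot 0 as unused padding.
N = 5
a = np.array([[1, 5], [2, 6], [3, 3], [4, 2]])
b = np.array([[3, 1], [4, 2], [1, 8], [2, 4]])

# Place each second-column value at the position given by its key.
a_trans = np.ones(N)
a_trans[a[:, 0]] = a[:, 1]

b_trans = np.ones(N)
b_trans[b[:, 0]] = b[:, 1]

# Both arrays now share the same ordering, so divide elementwise.
c = a_trans / b_trans
print(c[1:])  # skip the padding slot at index 0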

Depending on the nature of your problem, you can sometimes use an implicit index from the beginning, but sometimes an explicit index can be very useful. If you need an explicit index, consider using something like pandas.DataFrame with better support for index operations.
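For the explicit-index route, here is a minimal sketch of how it might look with pandas. It uses pandas.Series rather than a full DataFrame to keep things short, and the names a_series, b_series and ratio are only illustrative. It assumes the first-column values are unique within each array.

```python
import numpy as np
import pandas as pd

a = np.array([[1, 5], [2, 6], [3, 3], [4, 2]])
b = np.array([[3, 1], [4, 2], [1, 8], [2, 4]])

# Use the first column as an explicit index and the second column as the data.
a_series = pd.Series(a[:, 1], index=a[:, 0])
b_series = pd.Series(b[:, 1], index=b[:, 0])

# Division aligns on index labels, not on row position.
ratio = a_series / b_series

# Reorder to follow a's original key order and pull out a plain array.
c = ratio.reindex(a[:, 0]).to_numpy()
print(c)
```

Keys that appear in only one of the two series would come out as NaN here, so how to treat them is left to your use case.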

Upvotes: 0

Divakar

Reputation: 221524

You can use np.searchsorted to get the positions where b's first-column elements correspond to a's first-column elements, use those positions to pick out the respective second-column elements for the division, and finally get c. Assuming a and b are NumPy arrays, the vectorized implementation would be -

a0 = a[:,0]
c = np.true_divide(a[:,1],b[np.searchsorted(a0,b[:,0],sorter=a0.argsort()),1])

The approach listed above works for a generic case when the first column elements of a are not necessarily sorted. But, if they are sorted just like for the listed sample case, you can simply ignore the sorter input argument and have a simplified solution, like so -

c = np.true_divide(a[:,1],b[np.searchsorted(a0,b[:,0]),1])

Sample run -

In [35]: a
Out[35]: 
array([[1, 5],
       [2, 6],
       [3, 3],
       [4, 2]])

In [36]: b
Out[36]: 
array([[3, 1],
       [4, 2],
       [1, 8],
       [2, 4]])

In [37]: a0 = a[:,0]

In [38]: np.true_divide(a[:,1],b[np.searchsorted(a0,b[:,0],sorter=a0.argsort()),1])
Out[38]: array([ 0.625,  1.5  ,  3.   ,  1.   ])
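If the rows of a and b can come in arbitrary orders, it may be safer to look up each of a's keys among b's keys and map the result back through the argsort permutation, since searchsorted with a sorter returns positions in the sorted order rather than in the original array. The following is only a sketch of that variant: the function name divide_matched is made up for illustration, and it assumes every key in a occurs exactly once in b.

```python
import numpy as np

def divide_matched(a, b):
    """Divide a's second column by the second column of the b row with the same key."""
    a = np.asarray(a)
    b = np.asarray(b)
    order = np.argsort(b[:, 0])  # permutation that sorts b's keys
    # Position of each a key within the sorted b keys.
    pos = np.searchsorted(b[:, 0], a[:, 0], sorter=order)
    rows = order[pos]  # map back to row indices in the original b
    return np.true_divide(a[:, 1], b[rows, 1])

a = np.array([[1, 5], [2, 6], [3, 3], [4, 2]])
b = np.array([[3, 1], [4, 2], [1, 8], [2, 4]])
c = divide_matched(a, b)
print(c)
```

On the sample data this gives the same values as the run above.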

Upvotes: 4

hilberts_drinking_problem

Reputation: 11602

Given all of the assumptions in the comment section, this will work:

from __future__ import division  # must be the first statement; only needed on Python 2
from operator import itemgetter

a = [[1, 5], [2,6], [3,3], [4,2]]
b = [[3, 1], [4,2], [1,8], [2,4]]

result = [x / y for (_, x), (_, y) in zip(a, sorted(b, key=itemgetter(0)))]

Assumptions: lists have equal lengths, elements in the first position are unique for each list, first list is sorted by first element, every element that occurs in the first position in a also occurs in the first position in b.

Upvotes: 4

trans1st0r

Reputation: 2073

You can use a simple O(n^2) way with nested loops:

c = []

for x in a:
    for y in b:
        if x[0] == y[0]:
            c.append(x[1] / y[1])
            break

The above is useful when the lists are small. For huge lists, consider a dictionary-based approach, where the complexity would be O(n) at the cost of some extra space.
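A rough sketch of what the dictionary-based approach could look like, assuming Python 3 division and unique first-column values in b. The name b_lookup is just illustrative, and skipping keys that are missing from b is an assumption you may want to change.

```python
a = [[1, 5], [2, 6], [3, 3], [4, 2]]
b = [[3, 1], [4, 2], [1, 8], [2, 4]]

# Map each key in b's first column to its second-column value.
b_lookup = {key: value for key, value in b}

# Walk a once; keys missing from b are skipped (an assumption, adjust as needed).
c = [value / b_lookup[key] for key, value in a if key in b_lookup]
print(c)
```

Building the dictionary is one pass over b and each lookup is constant time on average, which is where the O(n) comes from.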

Upvotes: 1
