Jason

Reputation: 1375

Using Python List Comprehension to match values that exist in a 2d array and simple list

I am attempting to match values of a 2d array with a list in order to create a new 2d array that contains the list and its corresponding values from the 2d array. Probably easier to understand in code than English...

import numpy as np

m_out = np.arange(50).reshape(25,2)
m_out_list = list(m_out[:,1])

eqn_out = range(7,18)  # upper bound is exclusive; 18 so that 17 is included

c_list = [(x,y) for x in eqn_out for y in m_out[:,0] if x in (m_out_list)]

print(c_list)

This code produces an answer I expect,

[(7, 0), (7, 2), (7, 4), (7, 6), ...

however it is not what I am attempting to accomplish. What I would like the last part of the list comprehension to do (or any other method that works) is to provide an array that matches each value in the eqn_out list with its corresponding unique original value, i.e.

[(7,6), (9,8), (11,10), (13,12), (15,14), (17,16)]

I'm not sure how to do this exactly, any suggestions would be most appreciated.

Upvotes: 1

Views: 861

Answers (2)

Pierre GM

Reputation: 20339

Something like this:

[(j, i) for (i,j) in m_out if j in eqn_out]

seems to work. However, it's probably a bit wasteful, as we're iterating over the whole of m_out instead of a subset.

An alternative could be:

from functools import reduce  # required on Python 3; harmless on Python 2.6+
test = reduce(np.logical_or, (m_out[:,1]==i for i in eqn_out))
[(j,i) for (i,j) in m_out[test]]

Here, we build one boolean array per element of eqn_out (len(eqn_out) arrays in total) and combine them into a single mask with reduce(np.logical_or, ...). We then use this mask to select the rows we want from m_out. Because you want the element of the second column to come first, we still need the final list comprehension to swap each pair.

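Note that this requires creating about 2N boolean arrays, where N is len(eqn_out): N comparison results plus the intermediate results of combining them. That might be even more wasteful than the first solution... Both solutions can easily be applied to arrays with more than 2 columns, though.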
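For anyone who wants to run the two approaches side by side, here is a minimal sketch. It assumes Python 3 (hence the functools import) and assumes eqn_out runs from 7 to 17 inclusive, which is what the expected output in the question implies. The names pairs_scan, pairs_mask and mask are mine, not from the answer.

```python
from functools import reduce  # reduce is not a builtin on Python 3

import numpy as np

m_out = np.arange(50).reshape(25, 2)
eqn_out = range(7, 18)  # assumption: 18 so that 17 is included

# First approach: scan every row, keep those whose second column is wanted.
pairs_scan = [(int(j), int(i)) for (i, j) in m_out if j in eqn_out]

# Second approach: OR together one boolean mask per wanted value,
# then use the combined mask to select rows before swapping the columns.
mask = reduce(np.logical_or, (m_out[:, 1] == i for i in eqn_out))
pairs_mask = [(int(j), int(i)) for (i, j) in m_out[mask]]

print(pairs_scan)
print(pairs_mask)
```

Both lists should come out identical. The int() calls only turn numpy integers into plain Python ints so the printed result is easier to read.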

Upvotes: 3

DSM

Reputation: 353209

[Edited to put the simpler approach first.]

In practice I'd probably just do:

In [166]: d = dict(m_out[:,::-1])
In [167]: [(k, d[k]) for k in eqn_out if k in d]
Out[167]: [(7, 6), (9, 8), (11, 10), (13, 12), (15, 14), (17, 16)]
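As a standalone script, the dict approach could look roughly like the sketch below. The setup lines repeat the question's data (with eqn_out assumed to cover 7 through 17), and the names lookup and matched are just illustrative.

```python
import numpy as np

m_out = np.arange(50).reshape(25, 2)
eqn_out = range(7, 18)  # assumption: 7..17 inclusive, as in the expected output

# Reverse the columns so each row reads (second column, first column),
# then build a lookup table keyed on the second-column value.
lookup = {int(key): int(value) for key, value in m_out[:, ::-1]}

# Keep only the values of eqn_out that actually occur in the second column.
matched = [(k, lookup[k]) for k in eqn_out if k in lookup]
print(matched)
```

One thing to keep in mind: if the second column contained duplicate values, later rows would overwrite earlier ones in the dict, so this relies on those values being unique.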

But for fun, sticking with numpy, how about something like the following (where v = m_out[:,1], as defined in the session further down):

[Updated: better numpy method]:

In [15]: m_out[np.in1d(v, eqn_out)][:, ::-1]
Out[15]: 
array([[ 7,  6],
       [ 9,  8],
       [11, 10],
       [13, 12],
       [15, 14],
       [17, 16]])
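On recent NumPy versions, np.isin is the recommended replacement for np.in1d, so the same idea might be written as in this sketch. The variable names mask and result are mine, and eqn_out is assumed to be 7 through 17 as in the rest of this answer.

```python
import numpy as np

m_out = np.arange(50).reshape(25, 2)
eqn_out = np.arange(7, 18)  # 7..17 inclusive

# True for every row whose second column appears in eqn_out.
mask = np.isin(m_out[:, 1], eqn_out)

# Select those rows, then reverse the column order.
result = m_out[mask][:, ::-1]
print(result)
```

Unlike the searchsorted approach, this does not need the data to be sorted; the rows simply come back in the order they appear in m_out.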

Or my original numpy approach:

In [150]: import numpy as np
In [151]: m_out = np.arange(50).reshape(25,2)   
In [152]: v = m_out[:,1]    
In [153]: eqn_out = np.arange(7, 18)     
In [154]: eqn_out
Out[154]: array([ 7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17])

Keep only the values we know about:

In [155]: np.intersect1d(eqn_out, v)
Out[155]: array([ 7,  9, 11, 13, 15, 17])

Find where they're located (this assumes v is sorted!):

In [156]: v.searchsorted(np.intersect1d(eqn_out, v))
Out[156]: array([3, 4, 5, 6, 7, 8])

Use these indices for selection purposes:

In [157]: m_out[v.searchsorted(np.intersect1d(eqn_out, v))]
Out[157]: 
array([[ 6,  7],
       [ 8,  9],
       [10, 11],
       [12, 13],
       [14, 15],
       [16, 17]])

Flip:

In [158]: m_out[v.searchsorted(np.intersect1d(eqn_out, v))][:,::-1]
Out[158]: 
array([[ 7,  6],
       [ 9,  8],
       [11, 10],
       [13, 12],
       [15, 14],
       [17, 16]])
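If the data is not sorted, one possible workaround is to pass an argsort result through the sorter argument of searchsorted. This is only a sketch under a couple of assumptions: the second column holds unique values, and the shuffled array below is made up purely to simulate unsorted input. The names shuffled, order, wanted, pos and rows are all mine.

```python
import numpy as np

rng = np.random.default_rng(0)
m_out = np.arange(50).reshape(25, 2)
shuffled = m_out[rng.permutation(len(m_out))]  # same rows, arbitrary order
eqn_out = np.arange(7, 18)

v = shuffled[:, 1]
order = np.argsort(v)                 # indices that would sort v
wanted = np.intersect1d(eqn_out, v)   # sorted values present in both

# Positions of the wanted values within the sorted view of v.
pos = np.searchsorted(v, wanted, sorter=order)

# Map back to row indices in the unsorted array, select, and flip.
rows = shuffled[order[pos]][:, ::-1]
print(rows)
```

Because wanted only contains values that really occur in v, every position returned by searchsorted is an exact match rather than just an insertion point.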

Upvotes: 3
