Reputation: 33
I have a 2D array of arrays defined as follows:
traces = [['x1',11026,0,0,0,0],
['x0',11087,0,0,0,1],
['x0',11088,0,0,1,3],
['x0',11088,0,0,0,3],
['x0',11088,0,1,0,1]]
I want to find the index of the row which matches multiple conditions of selected columns. For example I want to find the row in this array where
row[0]=='x0' & row[1]==11088 & row[3]==1 & row[5]==1
Searching on this criteria should return 4.
I attempted to use numpy.where but can't seem to make it work with multiple conditions:
print np.where((traces[:,0] == 'x0') & (traces[:,1] == 11088) & (traces[:,3] == 1) & (traces[:,5] == 1))
The above creates the warning
FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  print np.where((traces[:,0] == 'x0') & (traces[:,1] == 11088) & (traces[:,3] == 1) & (traces[:,5] == 1))
(array([], dtype=int32),)
I've also tried numpy.logical_and, which doesn't work either and produces similar warnings.
Is there a way to do this with numpy.where without iterating over the whole 2D array?
Thanks
Upvotes: 3
Views: 5670
Reputation: 23637
I strongly assume you converted the list to an np.array, something like this:
traces = [['x1',11026,0,0,0,0],
['x0',11087,0,0,0,1],
['x0',11088,0,0,1,3],
['x0',11088,0,0,0,3],
['x0',11088,0,1,0,1]]
traces = np.array(traces)
This exhibits the described error. The reason can be seen by printing the resulting array:
print(repr(traces))
# array([['x1', '11026', '0', '0', '0', '0'],
# ['x0', '11087', '0', '0', '0', '1'],
# ['x0', '11088', '0', '0', '1', '3'],
# ['x0', '11088', '0', '0', '0', '3'],
# ['x0', '11088', '0', '1', '0', '1']],
# dtype='<U5')
Numbers were converted to strings!
When constructing an array from values of different types, numpy looks for a common dtype and falls back to dtype=object only when it finds none. Object arrays work in most cases but perform poorly.
In this case numpy did find a common type: it converted the data to a string type, which is more specific than object but general enough to hold the numbers, as strings.
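To see the coercion in isolation, here is a minimal sketch that rebuilds the array from the question and checks its dtype. It also shows, as an aside rather than a recommendation, that the string array can still be searched if every condition is written as a string. The names as_strings, mask and matches are illustrative only.

```python
import numpy as np

traces = [['x1', 11026, 0, 0, 0, 0],
          ['x0', 11087, 0, 0, 0, 1],
          ['x0', 11088, 0, 0, 1, 3],
          ['x0', 11088, 0, 0, 0, 3],
          ['x0', 11088, 0, 1, 0, 1]]

# Mixed str/int input is coerced to a unicode string dtype
as_strings = np.array(traces)
print(as_strings.dtype.kind)  # 'U'

# String-to-string comparisons are elementwise, so the conditions combine with &
mask = ((as_strings[:, 0] == 'x0') & (as_strings[:, 1] == '11088') &
        (as_strings[:, 3] == '1') & (as_strings[:, 5] == '1'))
matches = np.where(mask)[0]
print(matches)  # [4]
```

Comparing against strings hides the numeric meaning of the columns (no ordering comparisons such as < or >), which is why the explicit object array is the more direct fix for the original code.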
As a solution, construct the array explicitly as an object array:
traces = np.array(traces, dtype='object')
print(np.where((traces[:,0] == 'x0') & (traces[:,1] == 11088) & (traces[:,3] == 1) & (traces[:,5] == 1)))
# (array([4], dtype=int32),)
Note that although this works, object arrays are often not a good idea because they give up most of numpy's speed. Consider replacing the strings in the first column with numeric values instead.
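One possible way to do that replacement, sketched under the assumption that the labels in the first column only need to be tested for equality: np.unique with return_inverse=True assigns each distinct label an integer code, and the rest of the row is already numeric. The names names, codes, code_of and numeric are made up for this example.

```python
import numpy as np

traces = [['x1', 11026, 0, 0, 0, 0],
          ['x0', 11087, 0, 0, 0, 1],
          ['x0', 11088, 0, 0, 1, 3],
          ['x0', 11088, 0, 0, 0, 3],
          ['x0', 11088, 0, 1, 0, 1]]

# Map each distinct label to an integer code (codes index into names)
labels = [row[0] for row in traces]
names, codes = np.unique(labels, return_inverse=True)
code_of = {str(name): i for i, name in enumerate(names)}

# Build a purely integer array: label code followed by the numeric columns
numeric = np.column_stack([codes, [row[1:] for row in traces]])

mask = ((numeric[:, 0] == code_of['x0']) & (numeric[:, 1] == 11088) &
        (numeric[:, 3] == 1) & (numeric[:, 5] == 1))
matches = np.where(mask)[0]
print(matches)  # [4]
```

The lookup table code_of has to be kept alongside the array so that labels can be translated in both directions.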
Upvotes: 3
Reputation: 18940
Consider this comparison:
>>> traces[:,[0,1,3,5]] == ['x0', 11088, 1, 1]
array([[False, False, False, False],
[ True, False, False, True],
[ True, True, False, False],
[ True, True, False, False],
[ True, True, True, True]])
We are looking for the row (or rows) in which every value is True:
>>> np.where(np.all(traces[:,[0,1,3,5]] == ['x0', 11088, 1, 1], axis=1))
(array([4]),)
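The same idea can be wrapped in a small helper. This is only a sketch: find_rows is a hypothetical name, and it assumes the rows still hold their original mixed Python types. Both sides of the comparison are converted to object arrays explicitly, because how a mixed list on the right-hand side gets coerced may depend on the array's dtype and on the numpy version.

```python
import numpy as np

def find_rows(rows, cols, values):
    """Return indices of rows whose selected columns equal the given values.

    Hypothetical helper: rows must contain the original mixed types
    (not an array that was already coerced to strings).
    """
    arr = np.asarray(rows, dtype=object)
    target = np.array(values, dtype=object)  # keeps each value's Python type
    # Compare the selected columns with the target, then require a full match per row
    return np.where(np.all(arr[:, cols] == target, axis=1))[0]

traces = [['x1', 11026, 0, 0, 0, 0],
          ['x0', 11087, 0, 0, 0, 1],
          ['x0', 11088, 0, 0, 1, 3],
          ['x0', 11088, 0, 0, 0, 3],
          ['x0', 11088, 0, 1, 0, 1]]

matches = find_rows(traces, [0, 1, 3, 5], ['x0', 11088, 1, 1])
print(matches)  # [4]
```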
Upvotes: 2