Goldee

Reputation: 13

What kind of array slice (?) is this?

I'm looking at some code and I see:

y = X[class_member_mask & core_samples_mask]

For what it's worth, type(X) is <type 'numpy.ndarray'> and len(X) is 150.

What is y? What kind of "slicing" is this? Does it remove or adjust certain members of X? Which and why?

Upvotes: 1

Views: 91

Answers (1)

user2357112

Reputation: 281207

It's not possible to tell with 100% certainty what's going on from the code we can see, but this looks like a NumPy advanced indexing operation.

When a NumPy array is indexed with an identically-shaped array of booleans, like so:

>>> x = numpy.array([[1, 2],
...                  [3, 4]])
>>> index_array = numpy.array([[True, False],
...                            [False, True]])
>>> x[index_array]
array([1, 4])

the result is a one-dimensional array of the elements at each position of x where the index array had a True element. The elements appear in the result array in the same order they appear in the flattened version of x. The result is a copy, not a view; modifying it will not affect x. (This is a special case of more general behavior for when x and index_array are not identically-shaped, but the shapes are probably identical here, and the full behavior is really hard to understand.)
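To check the "not a view" claim yourself, here is a minimal sketch reusing the arrays from the example above. The name selected is only for illustration.

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4]])
index_array = np.array([[True, False],
                        [False, True]])

# Boolean indexing picks the elements where the mask is True,
# in the order they appear in the flattened x.
selected = x[index_array]

# Modify the result; x should be untouched if the result is not a view.
selected[0] = 99

print(selected.tolist())  # [99, 4]
print(x.tolist())         # [[1, 2], [3, 4]]
```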

& is the bitwise and operator. For a pair of plain booleans, it gives the same result as the regular and operator (though without short-circuiting). For identically-shaped NumPy arrays of booleans:

>>> x = numpy.array([True, False, True])
>>> y = numpy.array([True, True, False])
>>> x & y
array([ True, False, False], dtype=bool)

it goes through and ands corresponding elements to create an array of results. (Again, this is a special case of much more general behavior, but explaining the full generalities would quadruple the length of the post.)
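As a rough illustration of why & is used here rather than and, the sketch below compares the two on small boolean arrays. The names a, b, and and_raised are made up for this example.

```python
import numpy as np

a = np.array([True, False, True])
b = np.array([True, True, False])

# & combines corresponding elements; np.logical_and does the same
# for boolean arrays.
elementwise = a & b
same_thing = np.logical_and(a, b)

# Python's `and` needs a single truth value for its left operand,
# which a multi-element array cannot provide.
try:
    a and b
    and_raised = False
except ValueError:
    and_raised = True

print(elementwise.tolist())  # [True, False, False]
print(same_thing.tolist())   # [True, False, False]
print(and_raised)            # True
```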

Putting it together, we can guess that class_member_mask and core_samples_mask are boolean arrays representing which elements of X satisfy certain criteria.
class_member_mask & core_samples_mask then creates an array representing which elements of X fit both conditions, and y = X[class_member_mask & core_samples_mask] selects all elements of X fitting both criteria.
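Here is a small sketch of how the whole expression might play out. The question doesn't show how the masks are built, so everything here is assumed: X is taken to be 2-D with one row per sample, labels is a hypothetical per-sample class label, and both masks are 1-D with one entry per row. In that case the mask selects whole rows of X rather than individual elements.

```python
import numpy as np

# Hypothetical stand-in data: 6 samples with 2 features each.
X = np.arange(12).reshape(6, 2)

# Assumed inputs, not taken from the question's code.
labels = np.array([0, 0, 1, 1, 0, 1])
core_samples_mask = np.array([True, False, True, True, True, False])

# True for samples belonging to class 0.
class_member_mask = (labels == 0)

# True only for samples that satisfy both conditions.
both = class_member_mask & core_samples_mask

# A 1-D boolean mask on a 2-D array keeps the rows where the mask is True.
y = X[both]

print(both.tolist())  # [True, False, False, False, True, False]
print(y.tolist())     # [[0, 1], [8, 9]]
```

Nothing is removed from or changed in X itself; y is a new array holding only the selected rows.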

Upvotes: 2
