Python and Numeric/numpy Array Slicing Behavior

Question

On Python 2.4, the single-colon slice operator : works as expected on Numeric matrices: it returns all values along the dimension it is applied to, for example all X and/or Y values of a 2-D matrix.

On Python 2.6, the single-colon slice operator behaves differently in some cases. For example, on a regular 2-D MxN matrix, m[:] can return zeros((0, N), 'l') as the resulting slice. One would expect the full matrix, which is what Python 2.4 returns.

Using either a double colon :: or 3 dots ... in Python2.6, instead of a single colon, seems to fix this issue and return the proper matrix slice.

After some guessing, I discovered that you get the same zeros output when you pass 0 as the stop index: m[:0] returns the same "zeros" output as m[:]. Is there any way to debug which indices are being picked when evaluating m[:]? Or did something change between the two Python versions (2.4 to 2.6) that would affect the behavior of the slicing operators?

The version of Numeric being used (24.2) is the same between both versions of Python. Why does the single colon slicing NOT work on Python 2.6 the same way it works with version 2.4?

Python2.6:

>>> a = array([[1,2,3],[4,5,6]])
>>> a[:]
zeros((0, 3), 'l')

>>> a[::]
array([[1,2,3],[4,5,6]])

>>> a[...]
array([[1,2,3],[4,5,6]])

Python2.4:

>>> a = array([[1,2,3],[4,5,6]])
>>> a[:]
array([[1,2,3],[4,5,6]])

>>> a[::]
array([[1,2,3],[4,5,6]])

>>> a[...]
array([[1,2,3],[4,5,6]])

(I typed the "code" up from scratch, so the syntax and printed output may not be exact, but it shows what's happening.)

David G. · Accepted Answer

The problem appears to be an integer overflow. In the Numeric source code, the matrix data structure being used is defined in a file called MA.py, in a class called MaskedArray. A line at the end of the class sets the array() function to this class. This was hard to track down, but it turned out to be critical.

The MaskedArray class also has a __getslice__(self, i, j) method that takes the start/stop indices and returns the corresponding slice. After adding debug output for those indices, I found that in the working case (Python 2.4), slicing an entire array passes start/stop indices of 0 and 2^31-1, respectively. Under Python 2.6, the stop index passed in is 2^63-1 instead.
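To see which index objects Python hands to a container, a tiny probe object is enough. The sketch below is not part of Numeric: SliceProbe is a hypothetical helper, and it is written for Python 3, where every subscript form goes through __getitem__. On Python 2, a class that also defines __getslice__ receives plain a[i:j] slices there instead, with omitted bounds already replaced by integers (0 and sys.maxint), while a[::] and a[...] still go through __getitem__.

```python
class SliceProbe(object):
    """Hypothetical debugging helper: returns whatever index it is given."""

    def __getitem__(self, key):
        # key is a slice object, Ellipsis, an int, or a tuple of those
        return key


probe = SliceProbe()

print(repr(probe[:]))    # slice(None, None, None)
print(repr(probe[::]))   # slice(None, None, None)
print(repr(probe[...]))  # Ellipsis
print(repr(probe[:0]))   # slice(None, 0, None)
```

Adding a similar print statement inside the real __getslice__ or __getitem__ method is essentially the debugging step described above.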

Somewhere, probably in the Numeric source/library code, only 32 bits are available to store the stop index when slicing arrays. Hence the 2^63-1 value was overflowing (any value of 2^31 or greater would overflow a signed 32-bit integer). The output slice in these bad cases ends up being equivalent to slicing from start 0 to stop 0, i.e. an empty matrix. When you slice with [0:-1] you do get a valid slice. I think 2^63-1 interpreted as a 32-bit number would come out to -1, so I'm not sure why slicing from 0 to 2^63-1 gives the same output as slicing from 0 to 0 (an empty matrix) rather than from 0 to -1 (where you get at least some output).

However, when I passed stop indices that overflow (i.e. 2^31 or greater) but whose lower 32 bits form a valid positive non-zero number, I got a valid slice back. For example, a stop index of 2^33+1 returns the same slice as a stop index of 1, because the lower 32 bits are 1 in both cases.
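The arithmetic behind these observations can be checked with a short sketch. It assumes the stop index is truncated the way a C cast to a signed 32-bit integer would truncate it; that is an assumption about Numeric's internals, not something read from its source, and to_int32 is a hypothetical helper name.

```python
def to_int32(value):
    """Keep the lower 32 bits of value and read them as a signed 32-bit integer."""
    low = value & 0xFFFFFFFF   # discard everything above bit 31
    if low >= 1 << 31:         # top bit set: negative in two's complement
        low -= 1 << 32
    return low


print(to_int32(2**31 - 1))   # 2147483647 (fits, unchanged)
print(to_int32(2**31))       # -2147483648 (first value that overflows)
print(to_int32(2**33 + 1))   # 1
print(to_int32(2**63 - 1))   # -1
```

This reproduces the stop index of 1 for 2^33+1 and confirms that 2^63-1 truncates to -1. It does not explain why a[:] comes back empty instead of matching a[0:-1].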

Python 2.4 Example code:

>>> a = array([[1,2,3],[4,5,6]])
>>> a[:]             # (which actually becomes a[0:2^31-1])
[[1,2,3],[4,5,6]]    # correct, expect the entire array

Python 2.6 Example code:

>>> a = array([[1,2,3],[4,5,6]])
>>> a[:]             # (which actually becomes a[0:2^63-1])
zeros((0, 3), 'l')   # incorrect b/c of overflow, should be full array
>>> a[0:0]
zeros((0, 3), 'l')   # correct, b/c slicing range is null
>>> a[0:2**33+1]
[ [1,2,3]]           # incorrect b/c of overflow, should be full array
                     # although it returned some data b/c the
                     # lower 32 bits of (2^33+1) = 1
>>> a[0:-1]
[ [1,2,3]]           # correct, although I'm not sure why "a[:]" doesn't
                     # give this output as well, given that the lower 32
                     # bits of 2^63-1 equal -1
