Garmekain

Reputation: 674

NumPy doesn't recognize the array shape correctly

I have the following code:

import numpy as np

data = np.array([[[i, j], i * j] for i in range(10) for j in range(10)])
print(data)

x = np.array(data[:,0])
x1 = x[:,0]
x2 = x[:,1]
print(x)

Printing data correctly outputs [[[0,0],0],[[0,1],0],[[0,2],0],...,[[9,9],81]], which is the multiplication table and its results.

So the first column of data (which is x) must be split into x1 and x2, its first and last columns respectively. I think I did this correctly, but it raises an error saying "too many indices for array". What am I doing wrong?

Upvotes: 0

Views: 392

Answers (2)

B. M.

Reputation: 18628

data.dtype is object because the elements of [[i,j],k] are not homogeneous: the first is a list and the second is a number. A workaround:

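import numpy as np

# Flat rows (i, j, i*j) keep every element homogeneous
data = np.array([(i, j, i * j) for i in range(10) for j in range(10)])
print(data)

x1 = data[:,:2]  # the (i, j) pairs, shape (100, 2)
x2 = data[:,2]   # the products, shape (100,)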

data.shape is now (100,3), data.dtype is int, and x1 and x2 are what you want.
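
If the goal is the one stated in the question, separate columns for i and j, the flat layout can also be sliced one column at a time. A minimal sketch, assuming the name y for the product column (that name does not appear in the original):

```python
import numpy as np

# Flat rows (i, j, i*j): every row has the same length, so NumPy builds a 2-D int array.
data = np.array([(i, j, i * j) for i in range(10) for j in range(10)])

x1 = data[:, 0]  # first factor i
x2 = data[:, 1]  # second factor j
y = data[:, 2]   # product i * j (the name `y` is an assumption for this sketch)

print(data.shape)                  # (100, 3)
print(np.array_equal(x1 * x2, y))  # True
```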

Upvotes: 1

hpaulj

Reputation: 231335

Because of the mix of list lengths, this produces an object array:

In [97]: data = np.array([[[i, j], i * j] for i in range(3) for j in range(3)])
In [98]: data
Out[98]: 
array([[[0, 0], 0],
       [[0, 1], 0],
       [[0, 2], 0],
       [[1, 0], 0],
       [[1, 1], 1],
       [[1, 2], 2],
       [[2, 0], 0],
       [[2, 1], 2],
       [[2, 2], 4]], dtype=object)
In [99]: data.shape
Out[99]: (9, 2)

One column contains numbers (but is still object dtype), the other contains lists. Both have shape (9,):

In [100]: data[:,1]
Out[100]: array([0, 0, 0, 0, 1, 2, 0, 2, 4], dtype=object)
In [101]: data[:,0]
Out[101]: 
array([[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], [2, 0], [2, 1],
       [2, 2]], dtype=object)
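
This is why x[:,0] in the question fails: x is 1-D, so a second index is one too many. A small sketch that reproduces the error. Note that dtype=object is passed explicitly, which the original code does not do; NumPy 1.24 and later raise a ValueError for ragged input without it.

```python
import numpy as np

# dtype=object is explicit here: NumPy 1.24+ refuses ragged input without it.
data = np.array([[[i, j], i * j] for i in range(3) for j in range(3)], dtype=object)

x = data[:, 0]   # 1-D object array whose elements are Python lists
print(x.shape)   # (9,)

try:
    x[:, 0]      # two indices on a 1-D array
    failed = False
except IndexError as exc:
    failed = True
    print("IndexError:", exc)
```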

The easiest way of turning that column into a numeric array is via .tolist():

In [104]: np.array(data[:,0].tolist())
Out[104]: 
array([[0, 0],
       [0, 1],
       [0, 2],
       [1, 0],
       [1, 1],
       [1, 2],
       [2, 0],
       [2, 1],
       [2, 2]])
In [105]: _.shape
Out[105]: (9, 2)
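
Applied to the question's 10 x 10 table, the same conversion yields the two columns that were asked for. A sketch that reuses the question's names x, x1 and x2; the explicit dtype=object is an addition for newer NumPy versions:

```python
import numpy as np

data = np.array([[[i, j], i * j] for i in range(10) for j in range(10)], dtype=object)

# Round-trip through a plain list of lists so NumPy can infer a regular (100, 2) int array.
x = np.array(data[:, 0].tolist())
x1 = x[:, 0]   # i values
x2 = x[:, 1]   # j values

print(x.shape)  # (100, 2)
```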

The [i, j, i * j] elements as suggested in the other answer are easier to work with.


A structured array approach to generating such a 'table':

In [113]: dt='(2)int,int'
In [114]: data = np.array([([i, j], i * j) for i in range(3) for j in range(3)],
     ...: dtype=dt)
In [115]: data
Out[115]: 
array([([0, 0], 0), ([0, 1], 0), ([0, 2], 0), ([1, 0], 0), ([1, 1], 1),
       ([1, 2], 2), ([2, 0], 0), ([2, 1], 2), ([2, 2], 4)], 
      dtype=[('f0', '<i4', (2,)), ('f1', '<i4')])
In [116]: data['f0']
Out[116]: 
array([[0, 0],
       [0, 1],
       [0, 2],
       [1, 0],
       [1, 1],
       [1, 2],
       [2, 0],
       [2, 1],
       [2, 2]])
In [117]: data['f1']
Out[117]: array([0, 0, 0, 0, 1, 2, 0, 2, 4])
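
The default field names f0 and f1 can be replaced with descriptive ones. A sketch, assuming the hypothetical field names pair and product, which are not part of the session above:

```python
import numpy as np

# Field names 'pair' and 'product' are illustrative; the session above uses the defaults f0/f1.
dt = np.dtype([("pair", int, (2,)), ("product", int)])
data = np.array([([i, j], i * j) for i in range(3) for j in range(3)], dtype=dt)

x1 = data["pair"][:, 0]   # i values
x2 = data["pair"][:, 1]   # j values
print(data["product"])    # [0 0 0 0 1 2 0 2 4]
```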

Upvotes: 1
