Numpy array creation

Question

First off, I apologize for the arbitrariness of this question. I am rewriting some of my scripts to use NumPy arrays instead of nested Python lists (for performance and memory), but I'm still struggling with how to declare them.

I am trying to create a structure using NumPy arrays. I am starting with 1000 (an arbitrary value) elements in the array, where each element should contain a float (as [x][0]) and a nested array of coordinates (so 10000 x 2 floats per top-level element) (as [x][1], with each element of the nested array accessible as [x][1][y][z], where y is the index into the nested array and z selects which of the 2 coordinates). The question "Nested Structured Numpy Array" creates a nearly identical structure (as a reference for my question and my desired structure).

Schematic raw data example:

time 0
  m/z 10 int 10
  m/z 20 int 20
  m/z 30 int 1000
  ...
time 1

I have read that I have to use the dtype argument to define the nested array, but I am not sure how to declare the dimensions for an empty array. Could anyone give me a hand? Here is what I came up with so far.

data=np.zeroes((1000,2 /* Now add nested array */), dtype=[('time', 'f'), [('m/z','f'), ('intensity','f')]])

PS: Might a matrix be a better option for this?

Janne Karila · Accepted Answer

>>> import numpy as np
>>> a = np.zeros(1000, dtype='float32, (10000,2)float32')
>>> a[200][0]
0.0
>>> a[200][1][2000]
array([ 0.,  0.], dtype=float32)
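The same layout can also be declared with named fields, which may read more naturally for the time / (m/z, intensity) data in the question. This is only a sketch: the field names `time` and `peaks` and the small sizes are assumptions chosen for illustration (the question uses 1000 and 10000).

```python
import numpy as np

# Small illustrative sizes (assumption); the question uses 1000 and 10000.
n_times, n_peaks = 5, 4

# One float per record plus a (n_peaks, 2) sub-array of (m/z, intensity) pairs.
# The field names "time" and "peaks" are hypothetical.
dt = np.dtype([("time", "f4"), ("peaks", "f4", (n_peaks, 2))])
data = np.zeros(n_times, dtype=dt)

# Fill the time column for all records at once.
data["time"] = np.arange(n_times)

# Set the third (m/z, intensity) pair of the record at index 1.
data["peaks"][1, 2] = (30.0, 1000.0)

print(data["peaks"].shape)      # shape of the sub-array field across all records
print(data[1]["peaks"][2])      # the pair that was just written
```

Accessing a field by name returns a view over all records, so `data["peaks"]` behaves like an ordinary three-dimensional array.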

Note that this creates 1000 records, each holding a float and its own array of dimension (10000,2). That's fine if you only ever do operations that look at just one of those arrays. If you use a separate (1000,10000,2) array instead, you can take better advantage of vectorized operations in NumPy. For example, you could increment all the second coordinates in one operation like this:

>>> b = np.zeros((1000,10000,2))
>>> b[:,:,1] += 1

Trying to do the same with a[:][1][:,1] raises an error, because a[:][1] is just the single record a[1], not the second field of every record.
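As a rough sketch of both options side by side: the separate-arrays layout keeps the times in their own one-dimensional array, while the structured array can still be updated in one step by selecting the whole field by name. The sizes are shrunk for illustration, the name `times` is an assumption, and `f1` is the default name NumPy gives the second unnamed field.

```python
import numpy as np

# Small illustrative sizes (assumption); the answer uses 1000 and 10000.
n_times, n_peaks = 5, 4

# Option 1: two plain arrays, one for the times and one for the coordinates.
times = np.zeros(n_times, dtype=np.float32)
b = np.zeros((n_times, n_peaks, 2), dtype=np.float32)
b[:, :, 1] += 1  # increment every second coordinate in one operation

# Option 2: the structured array from the answer. Unnamed fields get the
# default names "f0" and "f1"; a["f1"] is a view of shape (n_times, n_peaks, 2).
a = np.zeros(n_times, dtype=f"float32, ({n_peaks},2)float32")
a["f1"][:, :, 1] += 1  # same update, applied through the field view

print(np.array_equal(a["f1"], b))
```

Which layout is preferable depends on the access pattern; the plain arrays are simpler when the time values and the coordinates are mostly processed separately.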
